sampleNP {NonProbEst}

R Documentation

A non-probabilistic sample

Description

A dataset of 1000 individuals drawn from the subpopulation with internet access in a simulated population of 50,000 individuals. The sample is designed to reproduce a case of nonprobability sampling with selection bias, as there are important differences between the potentially covered population, the covered population and the full target population. Further details on the generation of the dataset can be found in Ferri-García and Rueda (2018). The dataset contains the following variables:

vote_gen. A binary variable indicating whether the individual's voting preference is for Party 1. This variable is related to gender.
vote_pens. A binary variable indicating whether the individual's voting preference is for Party 2. This variable is related to age.
vote_pir. A binary variable indicating whether the individual's voting preference is for Party 3. This variable is related to age and internet access.
education_primaria. A binary variable indicating whether the highest academic level achieved by the individual is Primary Education.
education_secundaria. A binary variable indicating whether the highest academic level achieved by the individual is Secondary Education.
education_terciaria. A binary variable indicating whether the highest academic level achieved by the individual is Tertiary Education.
age. A numeric variable, with values ranging from 18 to 100, indicating the age of the individual.
sex. A binary variable indicating whether the individual is a man.
language. A binary variable indicating whether the individual is a native.
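
As a quick illustration of how these variables might be summarised, the sketch below computes the unweighted share of ones for each binary column. It assumes that the NonProbEst package is installed and that the binary variables are stored as 0/1 values; neither that coding nor the helper name `share_of_ones` comes from this page. Because the sample is subject to selection bias, these shares describe the sample only and should not be read as population estimates.

```r
# Hypothetical helper (not part of NonProbEst): share of ones in a 0/1 vector.
share_of_ones <- function(x) mean(as.numeric(x) == 1, na.rm = TRUE)

# Binary variables as listed in the description above.
binary_vars <- c("vote_gen", "vote_pens", "vote_pir",
                 "education_primaria", "education_secundaria",
                 "education_terciaria", "sex", "language")

# Only touch the dataset if the package is available.
if (requireNamespace("NonProbEst", quietly = TRUE)) {
  data("sampleNP", package = "NonProbEst")
  # Unweighted sample shares; assumes 0/1 coding of the binary columns.
  print(round(sapply(sampleNP[binary_vars], share_of_ones), 3))
  # Age is documented as numeric with values between 18 and 100.
  print(range(sampleNP$age))
}
```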

Usage

sampleNP

Format

An object of class data.frame with 1000 rows and 9 columns.
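
The documented format can be checked after loading the data. The following is a minimal sketch, assuming the NonProbEst package is installed; `has_documented_shape` is a hypothetical helper introduced here for illustration, not a package function.

```r
# Hypothetical helper: TRUE if `df` is a data.frame with the documented shape.
has_documented_shape <- function(df, n_rows = 1000, n_cols = 9) {
  is.data.frame(df) && nrow(df) == n_rows && ncol(df) == n_cols
}

# Only touch the dataset if the package is available.
if (requireNamespace("NonProbEst", quietly = TRUE)) {
  data("sampleNP", package = "NonProbEst")
  # Compare the loaded object against the documented 1000 x 9 layout.
  print(has_documented_shape(sampleNP))
  # Show column names and storage types.
  str(sampleNP)
}
```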

References

Ferri-García, R., & Rueda, M. (2018). Efficiency of propensity score adjustment and calibration on the estimation from non-probabilistic online surveys. SORT-Statistics and Operations Research Transactions, 42(2), 159-182.

[Package NonProbEst version 0.2.4 Index]