Data for cleaning {epiDisplay} | R Documentation
Dataset for practicing cleaning, labelling and recoding
Description
The data come from clients of a family planning clinic.
For all variables except ID, the values 9, 99, 99.9, 888 and 999 represent missing values.
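The recoding of these codes to NA can be sketched as follows. This is a minimal illustration on an invented toy data frame, not the real Planning data, and the mapping of each variable to its missing-value code (`na_codes`) is an assumption: the description lists the codes but does not say which code belongs to which variable.

```r
# Toy stand-in for Planning (all values invented for illustration only)
toy <- data.frame(
  id    = c(1, 2, 3, 4),
  age   = c(25, 99, 31, 28),
  relig = c(1, 2, 9, 1),
  wt    = c(52.5, 99.9, 60.0, 48.0),
  ht    = c(155, 160, 999, 150)
)

# Assumed variable -> missing-code mapping (hypothetical, not from the source)
na_codes <- list(age = 99, relig = 9, wt = 99.9, ht = 999)

# Replace each variable's missing-value code with NA; 'id' is left untouched
cleaned <- toy
for (v in names(na_codes)) {
  cleaned[[v]][cleaned[[v]] %in% na_codes[[v]]] <- NA
}

# Count the missing values per variable
colSums(is.na(cleaned))
```

A per-variable mapping is used rather than a blanket replacement because a value such as 9 or 99 may be a legitimate measurement in some columns.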
Usage
data(Planning)
Format
A data frame with 251 observations on the following 11 variables.
ID
a numeric vector: ID code
AGE
a numeric vector
RELIG
a numeric vector: Religion
1 = Buddhist
2 = Muslim
PED
a numeric vector: Patient's education level
1 = none
2 = primary school
3 = secondary school
4 = high school
5 = vocational school
6 = university
7 = other
INCOME
a numeric vector: Monthly income in Thai Baht
1 = nil
2 = < 1,000
3 = 1,000-4,999
4 = 5,000-9,999
5 = 10,000
AM
a numeric vector: Age at marriage
REASON
a numeric vector: Reason for family planning
1 = birth spacing
2 = enough children
3 = other
BPS
a numeric vector: systolic blood pressure
BPD
a numeric vector: diastolic blood pressure
WT
a numeric vector: weight (kg)
HT
a numeric vector: height (cm)
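The coded variables above can be labelled with base R's factor(). The following is a small sketch on invented toy vectors, not the real data; the names `relig_f` and `reason_f` are hypothetical. Codes not listed in `levels` (such as a missing-value code of 9) become NA.

```r
# Toy vectors using the codes documented above (values invented)
relig  <- c(1, 2, 1, 9)
reason <- c(1, 3, 2, 2)

# Attach the documented labels; the unlisted code 9 is converted to NA
relig_f <- factor(relig, levels = 1:2,
                  labels = c("Buddhist", "Muslim"))
reason_f <- factor(reason, levels = 1:3,
                   labels = c("birth spacing", "enough children", "other"))

# Tabulate, showing NA only if present
table(relig_f, useNA = "ifany")
table(reason_f, useNA = "ifany")
```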
Examples
data(Planning)
des(Planning)
# Change variable names to lowercase
names(Planning) <- tolower(names(Planning))
.data <- Planning
des(.data)
# Check for duplication of 'id'
attach(.data)
any(duplicated(id))
duplicated(id)
id[duplicated(id)] # 215
# Which one(s) are missing?
setdiff(min(id):max(id), id) # 216
# Correct the wrong one
# (this assigns to a copy of 'id' in the workspace; .data itself is unchanged)
id[duplicated(id)] <- 216
detach(.data)
rm(list=ls())
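Because the example above corrects a copy of 'id' rather than the data frame, a variant that edits the data frame column directly may be useful. This is only a sketch on an invented toy data frame; it assumes exactly one duplicated ID and exactly one gap in the sequence, and applies no correction otherwise.

```r
# Toy data: ID 215 is entered twice and 216 is missing (values invented)
toy <- data.frame(id = c(213, 214, 215, 215, 217),
                  age = c(25, 30, 28, 33, 41))

# Flag the second and later occurrences of each ID
dup <- duplicated(toy$id)

# IDs absent from the otherwise consecutive sequence
missing_id <- setdiff(seq(min(toy$id), max(toy$id)), toy$id)

# Only correct when the fix is unambiguous: one duplicate, one gap
if (sum(dup) == 1 && length(missing_id) == 1) {
  toy$id[dup] <- missing_id
}

# Confirm that no duplicates remain
any(duplicated(toy$id))
```

Indexing with `toy$id[dup]` changes the data frame itself, so no attach() or detach() is needed.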
[Package epiDisplay version 3.5.0.2 Index]