preprocess.data {bets.covid19} | R Documentation |
Prepare data frame for analysis
Description
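Prepare the case data frame for analysis: convert dates to day counts, select either Wuhan-exported or outside cases, and bound each case's infection window.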
Usage
preprocess.data(
data,
infected_in = c("Wuhan", "Outside"),
symptom_impute = FALSE
)
Arguments
data |
A data frame |
infected_in |
Either "Wuhan" or "Outside" |
symptom_impute |
Whether to use the dates of the initial medical visit and of confirmation to impute a missing symptom onset. |
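The vector default for infected_in shown under Usage resembles the common base R match.arg idiom, in which the first element is used when the argument is omitted. This page does not say whether preprocess.data is implemented that way, so the following is only a sketch of the idiom; the function pick_group is hypothetical and not part of the package.

```r
# Hypothetical function (not from bets.covid19) that mirrors the
# infected_in argument shown under Usage.
pick_group <- function(infected_in = c("Wuhan", "Outside")) {
  # match.arg() returns the first choice when the argument is omitted
  # and raises an error for a value outside the allowed set.
  match.arg(infected_in)
}

default_group <- pick_group()
outside_group <- pick_group("Outside")
```

Under this idiom a value other than the two listed choices stops with an error instead of being silently accepted.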
Details
A summary of the procedures:
Convert all dates to the number of days since 1-Dec-2019.
Separate the data into cases returned from Wuhan and cases infected outside of Wuhan.
Restrict to cases with a known symptom onset date.
Parse column 'Infected' into two columns: Infected_first and Infected_last.
For all cases, set Infected_first to 1 if it is missing.
For outside cases, set Infected_last to be no later than symptom onset.
For Wuhan-exported cases, set Infected_last to be no later than both symptom onset and the end of the Wuhan stay.
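The date conversion and the bounding of the infection window can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the toy data frame and the helper days_since_origin are invented, and the sketch assumes that 1-Dec-2019 maps to day 1, which the default of 1 for Infected_first suggests but this page does not state. Only the column names Infected_first, Infected_last, End_Wuhan and Symptom come from this page.

```r
# Hypothetical helper: day index, ASSUMING 1-Dec-2019 is day 1.
days_since_origin <- function(x, origin = as.Date("2019-11-30")) {
  as.numeric(as.Date(x) - origin)
}

# Toy data invented for illustration; not taken from covid19_data.
toy <- data.frame(
  Infected_first = c(NA, 40),
  Infected_last  = c(60, 58),
  End_Wuhan      = c(52, NA),  # NA marks a case infected outside Wuhan
  Symptom        = days_since_origin(c("2020-01-25", "2020-01-24"))
)

# A missing start of the infection window defaults to day 1.
toy$Infected_first[is.na(toy$Infected_first)] <- 1

# Infected_last can be no later than symptom onset.
toy$Infected_last <- pmin(toy$Infected_last, toy$Symptom)

# For Wuhan-exported cases it can also be no later than the end of the stay.
wuhan <- !is.na(toy$End_Wuhan)
toy$Infected_last[wuhan] <- pmin(toy$Infected_last[wuhan], toy$End_Wuhan[wuhan])

toy
```

Using pmin keeps the operation vectorised, so each row is bounded by its own symptom onset and stay end without an explicit loop.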
Value
A data frame
Author(s)
Nianqiao Ju <nju@g.harvard.edu>, Qingyuan Zhao <qyzhao@statslab.cam.ac.uk>
Examples
data(covid19_data)
head(data <- preprocess.data(covid19_data))
## This is how the wuhan_exported data frame is created
data <- subset(data, Symptom < Inf)
data <- subset(data, Arrived <= 54)
data$Location <- do.call(rbind, strsplit(as.character(data$Case), "-"))[, 1]
wuhan_exported <- data.frame(Location = data$Location,
B = data$Begin_Wuhan,
E = data$End_Wuhan,
S = data$Symptom)
## usethis::use_data(wuhan_exported)