plot_missing {states} | R Documentation |
Visualize missing and non-proper cases for state panel data
Description
Plot missing values by country and date, and additionally identify country-date cases that do or do not match an independent state list.
Usage
plot_missing(
  data,
  x = NULL,
  ccode = NULL,
  time = NULL,
  period = NULL,
  statelist = NULL,
  partial = "any",
  skip_labels = 5,
  space = deprecated()
)
missing_info(
  data,
  x = NULL,
  ccode = NULL,
  time = NULL,
  period = NULL,
  statelist = NULL,
  partial = NULL,
  space = deprecated()
)
Arguments
data |
State panel data frame |
x |
Variable name(s), e.g. "x" or c("x1", "x2"). Default is NULL, in which case all columns except the ccode and time ID columns will be used. |
ccode |
Name of variable identifying state country codes. If NULL (default) and one of "gwcode" or "cowcode" is a column in the data, it will be used. |
time |
Name of time identifier. If NULL and a "date" or "year" column is in the data, it will be used ("year" preferentially, if both are present). |
period |
Time period in which the data are. NULL by default and inferred
to be "year" if the "time" column has name "year" or contains integers with
a range between 1799 and 2050. Required if the "time" column is a
|
statelist |
Check not only missing values, but also the presence or absence of observations against a list of independent states? One of "GW", "COW" or "none". NULL by default, in which case it will be inferred if the ccode column is named "gwcode" or "cowcode", and set to "none" otherwise. |
partial |
Option for how to handle edge cases where a state is independent
for only part of a time period (year, month, etc.). Options include
"exact", and "any". See |
skip_labels |
Only plot the label for every n-th country on the y-axis to avoid overplotting. |
space |
Deprecated, use "ccode" argument instead. |
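The defaults described above for the time and period arguments can be sketched in base R. This is only an illustration of the documented rules, not the package's implementation: guess_time() and guess_period() are hypothetical helper names, and treating the 1799 and 2050 bounds as inclusive is an assumption.

```r
# Hypothetical helpers that mimic the documented defaults; they are NOT
# functions from the states package.

# Pick the time column: "year" is preferred over "date" when both exist.
guess_time <- function(data) {
  candidates <- c("year", "date")
  found <- candidates[candidates %in% names(data)]
  if (length(found) == 0) {
    stop("No 'year' or 'date' column found; set `time` explicitly.")
  }
  found[1]
}

# Infer the period: "year" if the column is named "year" or holds whole
# numbers within 1799-2050 (inclusive bounds are an assumption); else NULL.
guess_period <- function(data, time) {
  if (identical(time, "year")) {
    return("year")
  }
  values <- data[[time]]
  values <- values[!is.na(values)]
  if (!is.numeric(values) || length(values) == 0) {
    return(NULL)
  }
  whole <- all(values == round(values))
  in_range <- min(values) >= 1799 && max(values) <= 2050
  if (whole && in_range) "year" else NULL
}

# Made-up panel whose time column is not called "year"
panel <- data.frame(gwcode = 2, yr = 1990:1995)
guess_period(panel, "yr")
```

When the sketch returns NULL, as it would for a Date column, the period cannot be inferred and has to be passed explicitly.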
Details
missing_info provides the information that is plotted with plot_missing. The latter returns a ggplot, and thus can be chained with other ggplot functions as usual.
Value
plot_missing returns a ggplot2 object.
missing_info returns a data frame with components:
ccode |
ccode identifier, with name equal to the "ccode" argument, e.g. "ccode". |
time |
Time identifier, with name equal to the "time" argument, e.g. "date". |
independent |
A logical vector; NA if the statelist argument is "none". |
missing_value |
A logical vector indicating whether that record has missing values. |
status |
The label used for plotting, combining the independence and missing value information for a case as appropriate. |
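Because missing_info returns an ordinary data frame, its logical columns can be summarized with base R. The sketch below uses a small hand-made data frame that only imitates the documented columns; the values are invented, the ID columns are assumed to be named "gwcode" and "year", and the status column is left out because its labels are not listed here.

```r
# Hand-made stand-in for missing_info() output; all values are invented.
info <- data.frame(
  gwcode        = c(2, 2, 2, 20, 20, 20),
  year          = rep(2000:2002, times = 2),
  independent   = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE),
  missing_value = c(FALSE, TRUE, FALSE, TRUE, TRUE, FALSE)
)

# Share of records with missing values, per country
share_missing <- aggregate(missing_value ~ gwcode, data = info, FUN = mean)

# Records for independent states that have missing values.
# `%in% TRUE` also drops NA, which 'independent' holds when no
# statelist is checked.
gaps <- info[info$independent %in% TRUE & info$missing_value, ]

share_missing
gaps
```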
Examples
# Create an example data frame with missing values
cy <- state_panel(as.Date("1980-06-30"), as.Date("2015-06-30"), by = "year",
                  useGW = TRUE)
# Set the seed first so that both the values and the missing pattern are
# reproducible
set.seed(1234)
cy$myvar <- rnorm(nrow(cy))
cy$myvar[sample(1:nrow(cy), nrow(cy)*.1, replace = FALSE)] <- NA
str(cy)
# Visualize missing values:
plot_missing(cy, statelist = "none")
# missing_info() generates the data underlying plot_missing():
head(missing_info(cy, statelist = "none"))
# If we specify a statelist to check against, 'independent' will now have
# values. Check the data against the G&W list of independent states:
head(missing_info(cy, statelist = "GW"))
plot_missing(cy, statelist = "GW")
# Live example with Polity data
data("polity")
head(polity)
plot_missing(polity, x = "polity", ccode = "ccode", time = "year",
             statelist = "COW")
# COW starts in 1816; Polity has excess data for several non-independent
# states after that date, and is missing coverage for several countries.
# The date option is relevant for years in which states gain or lose
# independence, so this will be slightly different:
polity$date <- as.Date(paste0(polity$year, "-01-01"))
polity$year <- NULL
plot_missing(polity, x = "polity", ccode = "ccode", time = "date",
             period = "year", statelist = "COW")
# plot_missing() returns a ggplot2 object, so it can be modified with the
# usual ggplot2 functions, e.g. to flip the axes:
polity$year <- as.integer(substr(polity$date, 1, 4))
polity$date <- NULL
plot_missing(polity, ccode = "ccode", statelist = "COW") +
  ggplot2::coord_flip()