sum_lump {migest} | R Documentation |
Sum and lump together small flows into an "other" category
Description
Lump together regions/countries if their flows are below a given threshold.
Usage
sum_lump(
  m,
  threshold = 1,
  lump = "flow",
  other_level = "other",
  complete = FALSE,
  fill = 0,
  return_matrix = TRUE,
  orig_col = "orig",
  dest_col = "dest",
  flow_col = "flow"
)
Arguments
m |
A |
threshold |
Numeric value used to determine small flows, origins or destinations that will be grouped (lumped) together. |
lump |
Character string to indicate where to apply the threshold. Choose from the |
other_level |
Character string for the origin and/or destination label for the lumped values below the |
complete |
Logical value to return a |
fill |
Numeric value used to fill small cells below the |
return_matrix |
Logical to return a matrix. Default |
orig_col |
Character string of the origin column name (when |
dest_col |
Character string of the destination column name (when |
flow_col |
Character string of the flow column name (when |
Details
The lump argument can take the values flow or bilat to apply the threshold to the data values for between-region migration, in or imm to apply the threshold to the incoming region totals, and out or emi to apply the threshold to the outgoing region totals.
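The quantities that each lump option compares against the threshold can be sketched in base R. This is only an illustration and does not call sum_lump itself; it assumes that in/imm and out/emi use the column and row totals of the origin-destination matrix, and that "below" means strictly less than the threshold.

```r
# example origin-destination matrix (same values as in the Examples section)
r <- LETTERS[1:4]
m <- matrix(data = c(0, 100, 30, 10, 50, 0, 50, 5, 10, 40, 0, 40, 20, 25, 20, 0),
            nrow = 4, ncol = 4, dimnames = list(orig = r, dest = r), byrow = TRUE)
threshold <- 100

# totals assumed to be compared when lump = "in" (or "imm")
in_total <- colSums(m)
# totals assumed to be compared when lump = "out" (or "emi")
out_total <- rowSums(m)

# regions whose totals fall below the threshold (assuming a strict "<")
small_in <- names(in_total)[in_total < threshold]
small_out <- names(out_total)[out_total < threshold]
small_in
small_out

# individual cells compared when lump = "flow" (or "bilat"), here with threshold 40
small_flow <- m < 40
small_flow
```

Under these assumptions, region C has an incoming total exactly equal to the threshold and so is not flagged as small on the incoming side.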
Value
A tibble with an additional other origin and/or destination region, formed by grouping together small values below the threshold argument; the lump argument indicates where the threshold is applied.
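The shape of such a result can be sketched by hand in base R. The code below is a rough approximation for the incoming side only, not the package's implementation: it assumes that destinations with incoming totals strictly below the threshold are relabelled "other" and that flows are then summed within each origin-destination pair. The object names (d, lumped) are chosen for illustration.

```r
r <- LETTERS[1:4]
m <- matrix(data = c(0, 100, 30, 10, 50, 0, 50, 5, 10, 40, 0, 40, 20, 25, 20, 0),
            nrow = 4, ncol = 4, dimnames = list(orig = r, dest = r), byrow = TRUE)
threshold <- 100

# long format: one row per origin-destination pair, columns orig, dest, flow
d <- as.data.frame(as.table(m), responseName = "flow", stringsAsFactors = FALSE)

# destinations with small incoming totals (assuming a strict "<")
in_total <- tapply(d$flow, d$dest, sum)
small <- names(in_total)[in_total < threshold]

# relabel the small destinations and sum the flows within each pair
d$dest <- ifelse(d$dest %in% small, "other", d$dest)
lumped <- aggregate(flow ~ orig + dest, data = d, FUN = sum)
lumped
```

The total flow is unchanged by the relabelling; only the number of destination categories shrinks.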
Examples
r <- LETTERS[1:4]
m <- matrix(data = c(0, 100, 30, 10, 50, 0, 50, 5, 10, 40, 0, 40, 20, 25, 20, 0),
            nrow = 4, ncol = 4, dimnames = list(orig = r, dest = r), byrow = TRUE)
m
# threshold on in and out region
sum_lump(m, threshold = 100, lump = c("in", "out"))
# threshold on flows (default)
sum_lump(m, threshold = 40)
# return a matrix (only possible when input is a matrix and
# complete = TRUE) with small values replaced by zeros
sum_lump(m, threshold = 50, complete = TRUE)
# return a data frame with small values replaced with zero
sum_lump(m, threshold = 80, complete = TRUE, return_matrix = FALSE)
## Not run:
# data frame (tidy) format
library(tidyverse)
# download Abel and Cohen (2019) estimates
f <- read_csv("https://ndownloader.figshare.com/files/38016762", show_col_types = FALSE)
f
# large 1990-1995 flow estimates
f %>%
  filter(year0 == 1990) %>%
  sum_lump(flow_col = "da_pb_closed", threshold = 1e5)
# large flow estimates for each year
f %>%
  group_by(year0) %>%
  sum_lump(flow_col = "da_pb_closed", threshold = 1e5)
## End(Not run)