plm.fast {plm} | R Documentation |
Option to Switch On/Off Fast Data Transformations
Description
A significant speed-up can be gained by using fast (panel) data transformation
functions from package collapse.
An additional significant speed-up for the two-way fixed effects case can be
achieved if package fixest or lfe is installed (package collapse needs to be
installed for the fast mode in any case).
Details
By default, this speed-up is enabled.
Option plm.fast can be used to enable/disable the speed-up. The option is
evaluated prior to execution of supported transformations (see below), so
options("plm.fast" = TRUE) enables the speed-up while
options("plm.fast" = FALSE) disables it.
To have it always switched off, put options("plm.fast" = FALSE) in your
.Rprofile file.
See Examples for how to use the option and for a benchmarking example.
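Because the option is read each time a supported transformation runs, it can also be switched off for a single block of code and restored afterwards. The following is a minimal sketch using only base R's options() and getOption(); the variable names old and fast_now are illustrative and not part of plm.

```r
# options() returns the previous values, so they can be restored later.
old <- options(plm.fast = FALSE)

# While the override is active, supported transformations would see FALSE.
fast_now <- getOption("plm.fast")

# ... run code that should use the base R implementations here ...

# Restore whatever was set before (including "not set at all").
options(old)
```

Restoring via the saved list avoids hard-coding TRUE as the value to return to, which matters if the option was changed earlier in the session.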
For a long time, package plm used base R implementations and R-based code. The
package collapse provides fast data transformation functions written
in C/C++, among them some especially suitable for panel data.
Having package collapse installed is a requirement for the speed-up, so
this package is a hard dependency for package plm.
Availability of packages fixest and lfe is checked once when
package plm is attached, and the additional speed-up for the two-way fixed
effects case is enabled automatically (fixest wins over lfe),
provided one of the packages is detected and options("plm.fast" = TRUE)
(default) is set. If so, the packages' fast algorithms to partial out fixed
effects are used (fixest::demean (via collapse::fhdwithin),
lfe::demeanlist). Both packages are 'Suggests' dependencies.
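To see which of the optional packages would be found on a given machine, the preference order described above can be approximated as follows. This is a sketch only: pick_twoway_backend() is a hypothetical helper, not a plm function, and plm's actual detection code may differ. The fallback value "collapse" is an assumption based on the values used for plm.fast.pkg.FE.tw in the Examples.

```r
# Hypothetical helper (not part of plm): mimic the stated preference order,
# fixest before lfe, falling back to collapse if neither is installed.
pick_twoway_backend <- function() {
  if (requireNamespace("fixest", quietly = TRUE)) return("fixest")
  if (requireNamespace("lfe", quietly = TRUE)) return("lfe")
  "collapse"
}

backend <- pick_twoway_backend()
print(backend)
```

End-users do not need to do this themselves; it is only meant to illustrate the selection logic.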
Users might experience negligible numerical differences between results with
the fast mode enabled and results with it disabled (base R implementation),
depending on the platform and the additional packages installed.
Currently, the following basic functions benefit from the speed-up; they are
used as building blocks in most model estimation functions, e.g., in plm
(more functions are under investigation):
between,
Between,
Sum,
Within,
lag, lead, and diff,
pseriesfy,
pdiff (internal function).
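A quick way to check that the fast and the base R code paths agree for one of the functions listed above is to run it under both settings and compare. The sketch below assumes package plm (with its Grunfeld data set) is installed; the names pGrun, w_slow and w_fast are illustrative.

```r
library("plm")
data("Grunfeld", package = "plm")
pGrun <- pdata.frame(Grunfeld, index = c("firm", "year"))

# Within transformation with the base R implementation ...
old <- options(plm.fast = FALSE)
w_slow <- Within(pGrun$inv)

# ... and with the fast mode switched on.
options(plm.fast = TRUE)
w_fast <- Within(pGrun$inv)

# Restore the previous setting.
options(old)

# Compare up to numerical tolerance (tiny differences are to be expected).
all.equal(as.numeric(w_slow), as.numeric(w_fast))
```

all.equal() is used rather than identical() because the two implementations may differ in the last few digits.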
Examples
## Not run:
### A benchmark of plm without and with speed-up
library("plm")
library("collapse")
library("microbenchmark")
rm(list = ls())
data("wlddev", package = "collapse")
form <- LIFEEX ~ PCGDP + GINI
# produce big data set (taken from collapse's vignette)
wlddevsmall <- get_vars(wlddev, c("iso3c","year","OECD","PCGDP","LIFEEX","GINI","ODA"))
wlddevsmall$iso3c <- as.character(wlddevsmall$iso3c)
data <- replicate(100, wlddevsmall, simplify = FALSE)
rm(wlddevsmall)
uniquify <- function(x, i) {
  x$iso3c <- paste0(x$iso3c, i)
  x
}
data <- unlist2d(Map(uniquify, data, as.list(1:100)), idcols = FALSE)
data <- pdata.frame(data, index = c("iso3c", "year"))
pdim(data) # Balanced Panel: n = 21600, T = 59, N = 1274400 // but many NAs
# data <- na.omit(data)
# pdim(data) # Unbalanced Panel: n = 13300, T = 1-31, N = 93900
times <- 1 # no. of repetitions for benchmark - this takes quite long!
onewayFE <- microbenchmark(
  {options("plm.fast" = FALSE); plm(form, data = data, model = "within")},
  {options("plm.fast" = TRUE);  plm(form, data = data, model = "within")},
  times = times)
summary(onewayFE, unit = "relative")
## two-way FE benchmark requires packages fixest and lfe
## (End-users shall only set option plm.fast. Option plm.fast.pkg.FE.tw shall
## _not_ be set by the end-user, it is determined automatically when pkg plm
## is attached; however, it needs to be set explicitly in this example for the
## benchmark.)
if(requireNamespace("fixest", quietly = TRUE) &&
   requireNamespace("lfe", quietly = TRUE)) {
  twowayFE <- microbenchmark(
    {options("plm.fast" = FALSE);
      plm(form, data = data, model = "within", effect = "twoways")},
    {options("plm.fast" = TRUE, "plm.fast.pkg.FE.tw" = "collapse");
      plm(form, data = data, model = "within", effect = "twoways")},
    {options("plm.fast" = TRUE, "plm.fast.pkg.FE.tw" = "fixest");
      plm(form, data = data, model = "within", effect = "twoways")},
    {options("plm.fast" = TRUE, "plm.fast.pkg.FE.tw" = "lfe");
      plm(form, data = data, model = "within", effect = "twoways")},
    times = times)
  summary(twowayFE, unit = "relative")
}
onewayRE <- microbenchmark(
  {options("plm.fast" = FALSE); plm(form, data = data, model = "random")},
  {options("plm.fast" = TRUE);  plm(form, data = data, model = "random")},
  times = times)
summary(onewayRE, unit = "relative")
twowayRE <- microbenchmark(
  {options("plm.fast" = FALSE); plm(form, data = data, model = "random", effect = "twoways")},
  {options("plm.fast" = TRUE);  plm(form, data = data, model = "random", effect = "twoways")},
  times = times)
summary(twowayRE, unit = "relative")
## End(Not run)