generate.sf.data {robflreg} | R Documentation |
Generate functional data for the scalar-on-function regression model
Description
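This function is used to simulate data for the scalar-on-function regression model
$$Y = \sum_{m=1}^{M} \int X_m(s) \beta_m(s) \, ds + \epsilon,$$
where $Y$ denotes the scalar response, $X_m(s)$ denotes the $m$-th functional predictor, $\beta_m(s)$ denotes the $m$-th regression coefficient function, $M$ is the number of functional predictors (n.pred), and $\epsilon$ is the error process.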
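To make the model concrete, the following base-R sketch approximates the integral in a scalar-on-function model by a trapezoidal sum on an equally spaced grid. The helper name `sof_response` and its arguments are hypothetical and are not part of robflreg; the package may compute the response differently.

```r
# Hypothetical helper (not part of robflreg): approximate
#   Y_i = sum_m integral of X_im(s) * beta_m(s) ds + eps_i
# on an equally spaced grid using trapezoidal weights.
sof_response <- function(X, beta, grid, eps) {
  stopifnot(length(X) == length(beta))
  h <- grid[2] - grid[1]                 # grid spacing
  w <- rep(h, length(grid))              # trapezoidal quadrature weights
  w[c(1, length(grid))] <- h / 2
  # One n-vector of integrals per predictor, summed over predictors
  signal <- Reduce(`+`, Map(function(Xm, bm) as.vector(Xm %*% (bm * w)),
                            X, beta))
  signal + eps
}

grid <- seq(0, 1, length.out = 101)
X <- list(matrix(1, nrow = 3, ncol = 101))   # three constant curves X(s) = 1
beta <- list(2 * grid)                       # beta(s) = 2s, which integrates to 1
Y <- sof_response(X, beta, grid, eps = rep(0, 3))
```

With a constant predictor and a linear coefficient function the trapezoidal rule is exact, so each response equals the integral of $2s$ over [0, 1] up to floating-point error.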
Usage
generate.sf.data(n, n.pred, n.gp, out.p = 0)
Arguments
n |
An integer, specifying the number of observations for each variable to be generated. |
n.pred |
An integer, denoting the number of functional predictors to be generated. |
n.gp |
An integer, denoting the number of equally spaced grid points on the interval [0, 1] at which the functional predictors are evaluated. |
out.p |
A number between 0 and 1, denoting the proportion of outliers in the generated data. |
Details
In the data generation process, the functional predictors are first simulated from the following process:
$$X_m(s) = \sum_{j=1}^{5} \kappa_j v_j(s),$$
where $\kappa_j$ is a vector generated from a Normal distribution with mean one and variance $\sqrt{a} j^{-1/2}$, $a$ is a uniformly generated random number between 1 and 4, and
$$v_j(s) = \sin(j \pi s) - \cos(j \pi s).$$
The regression coefficient functions are generated from a coefficient space that includes ten different functions such as
$$b \sin(2 \pi s)$$
and
$$b \cos(2 \pi s),$$
where $b$ is generated from a uniform distribution between 1 and 3. The error process is generated from the standard normal distribution.
If outliers are allowed in the generated data, i.e., out.p > 0, then a randomly selected $n \times$ out.p of the observations are generated differently from the process above. In that case, the regression coefficient functions (possibly different from the previously generated coefficient functions) are generated from the coefficient space with $b^*$ (instead of $b$), where $b^*$ is generated from a uniform distribution between 3 and 5, and are used to generate the outlying observations. In addition, the following process is used to generate the functional predictors:
$$X_m^*(s) = \sum_{j=1}^{5} \kappa_j^* v_j^*(s),$$
where $\kappa_j^*$ is a vector generated from a Normal distribution with mean one and variance $\sqrt{a} j^{-3/2}$ and
$$v_j^*(s) = 2 \sin(j \pi s) - \cos(j \pi s).$$
Moreover, the error process for the outlying observations is generated from a normal distribution with mean 1 and variance 1. All functional predictors are generated at equally spaced points in the interval [0, 1].
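The predictor-generating step can be sketched in base R as follows. This is an illustration under stated assumptions, not the robflreg source: it assumes five basis terms of the form $\sin(j \pi s) - \cos(j \pi s)$ with score variance $\sqrt{a} j^{-1/2}$ for regular curves, and $2 \sin(j \pi s) - \cos(j \pi s)$ with score variance $\sqrt{a} j^{-3/2}$ for outlying curves. The helper name `sim_predictor` is hypothetical.

```r
# Hypothetical helper (not the robflreg implementation): simulate n curves
# on n.gp equally spaced points in [0, 1] from five sine/cosine basis terms.
sim_predictor <- function(n, n.gp, a, outlier = FALSE) {
  s <- seq(0, 1, length.out = n.gp)
  X <- matrix(0, nrow = n, ncol = n.gp)
  for (j in 1:5) {
    if (outlier) {
      v <- 2 * sin(j * pi * s) - cos(j * pi * s)   # assumed outlier basis
      score_var <- sqrt(a) * j^(-3 / 2)            # assumed outlier variance
    } else {
      v <- sin(j * pi * s) - cos(j * pi * s)       # assumed regular basis
      score_var <- sqrt(a) * j^(-1 / 2)            # assumed regular variance
    }
    # rnorm() takes a standard deviation, so convert from the variance
    kappa <- rnorm(n, mean = 1, sd = sqrt(score_var))
    X <- X + outer(kappa, v)                       # n x n.gp contribution of term j
  }
  X
}

set.seed(1)
a <- runif(1, min = 1, max = 4)
X_reg <- sim_predictor(n = 20, n.gp = 101, a = a)
X_out <- sim_predictor(n = 5, n.gp = 101, a = a, outlier = TRUE)
```

Each row of the returned matrix is one curve evaluated on the grid, which matches the layout that `fdata()` expects in the Examples section.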
Value
A list object with the following components:
Y |
An $n \times 1$ matrix containing the simulated scalar response. |
X |
A list with length n.pred. The elements are the $n \times$ n.gp matrices containing the simulated functional predictors. |
f.coef |
A list with length n.pred. Each element is a vector and contains the generated regression coefficient function. |
out.indx |
A vector with length $n \times$ out.p containing the indices of the outlying observations. |
Author(s)
Ufuk Beyaztas and Han Lin Shang
Examples
library(robflreg) # provides generate.sf.data()
library(fda.usc)  # provides fdata() and its plot methods
library(fda)
set.seed(2022)
sim.data <- generate.sf.data(n = 400, n.pred = 5, n.gp = 101, out.p = 0.1)
Y <- sim.data$Y
X <- sim.data$X
coeffs <- sim.data$f.coef
out.indx <- sim.data$out.indx
# Scalar response: regular observations first, then outliers in blue
plot(Y[-out.indx,], type = "p", pch = 16, xlab = "Index", ylab = "",
     main = "Response", ylim = range(Y))
points(out.indx, Y[out.indx,], type = "p", pch = 16, col = "blue") # Outliers
# First functional predictor on the grid used for simulation (n.gp = 101)
fX1 <- fdata(X[[1]], argvals = seq(0, 1, length.out = 101))
plot(fX1[-out.indx,], lty = 1, ylab = "", xlab = "Grid point",
     main = expression(X[1](s)), mgp = c(2, 0.5, 0), ylim = range(fX1))
lines(fX1[out.indx,], lty = 1, col = "black") # Leverage points