generate.ff.data {robflreg}R Documentation

Generate functional data for the function-on-function regression model

Description

This function provides a unified simulation structure for the function-on-function regression model

Y(t) = \sum_{m=1}^M \int X_m(s) \beta_m(s,t) ds + \epsilon(t),

where Y(t) denotes the functional response, X_m(s) denotes the m-th functional predictor, \beta_m(s,t) denotes the m-th bivariate regression coefficient function, and \epsilon(t) is the error function.

Usage

generate.ff.data(n.pred, n.curve, n.gp, out.p = 0)

Arguments

n.pred

An integer, denoting the number of functional predictors to be generated.

n.curve

An integer, specifying the number of observations for each functional variable to be generated.

n.gp

An integer, denoting the number of grid points, i.e., a fine grid on the interval [0, 1].

out.p

An integer between 0 and 1, denoting the outlier percentage in the generated data.

Details

In the data generation process, first, the functional predictors are simulated based on the following process:

X_m(s) = \sum_{j=1}^5 \kappa_j v_j(s),

where \kappa_j is a vector generated from a Normal distribution with mean one and variance \sqrt{a} j^{-1/2}, a is a uniformly generated random number between 1 and 4, and

v_j(s) = \sin(j \pi s) - \cos(j \pi s).

The bivariate regression coefficient functions are generated from a coefficient space that includes ten different functions such as:

b \sin(2 \pi s) \sin(\pi t)

and

b e^{-3 (s - 0.5)^2} e^{-4 (t - 1)^2},

where b is generated from a uniform distribution between 1 and 3. The error function \epsilon(t), on the other hand, is generated from the Ornstein-Uhlenbeck process:

\epsilon(t) = l + [\epsilon_0(t) - l] e^{-\theta t} + \sigma \int_0^t e^{-\theta (t-u)} d W_u,

where l, \theta > 0, \sigma > 0 are constants, \epsilon_0(t) is the initial value of \epsilon(t) taken from W_u, and W_u is the Wiener process. If outliers are allowed in the generated data, i.e., out.p > 0, then, the randomly selected n.curve \times out.p of the data are generated in a different way from the aforementioned process. In more detail, if out.p > 0, the bivariate regression coefficient functions (possibly different from the previously generated coefficient functions) generated from the coefficient space with b^* (instead of b), where b^* is generated from a uniform distribution between 1 and 2, are used to generate the outlying observations. In addition, in this case, the following process is used to generate functional predictors:

X_m^*(s) = \sum_{j=1}^5 \kappa_j^* v_j^*(s),

where \kappa_j^* is a vector generated from a Normal distribution with mean one and variance \sqrt{a} j^{-3/2} and

v_j^*(s) = 2 \sin(j \pi s) - \cos(j \pi s).

All the functions are generated equally spaced point in the interval [0, 1].

Value

A list object with the following components:

Y

An n.curve \times n.gp-dimensional matrix containing the observations of simulated functional response variable.

X

A list with length n.pred. The elements are the n.curve \times n.gp-dimensional matrices containing the observations of simulated functional predictor variables.

f.coef

A list with length n.pred. Each element is a matrix and contains the generated bivariate regression coefficient function.

out.indx

A vector with length n.curve \times out.p denoting the indices of outlying observations.

Author(s)

Ufuk Beyaztas and Han Lin Shang

References

E. Garcia-Portugues and J. Alvarez-Liebana J and G. Alvarez-Perez G and W. Gonzalez-Manteiga W (2021) "A goodness-of-fit test for the functional linear model with functional response", Scandinavian Journal of Statistics, 48(2), 502-528.

Examples

library(fda)
library(fda.usc)
set.seed(2022)
sim.data <- generate.ff.data(n.pred = 5, n.curve = 200, n.gp = 101, out.p = 0.1)
Y <- sim.data$Y
X <- sim.data$X
coeffs <- sim.data$f.coef
out.indx <- sim.data$out.indx
fY <- fdata(Y, argvals = seq(0, 1, length.out = 101))
plot(fY[-out.indx,], lty = 1, ylab = "", xlab = "Grid point", 
     main = "Response", mgp = c(2, 0.5, 0), ylim = range(fY))
lines(fY[out.indx,], lty = 1, col = "black") # Outlying functions

[Package robflreg version 1.2 Index]