fnn.fit {FuncNN}    R Documentation

Fitting Functional Neural Networks

Description

This is the main function in the FuncNN package. It fits models of the form f(z, b(x)), where z denotes the scalar covariates and b(x) denotes the functional covariates. The function f() is a neural network with a generalized input space.

Usage

fnn.fit(
  resp,
  func_cov,
  scalar_cov = NULL,
  basis_choice = c("fourier"),
  num_basis = c(7),
  hidden_layers = 2,
  neurons_per_layer = c(64, 64),
  activations_in_layers = c("sigmoid", "linear"),
  domain_range = list(c(0, 1)),
  epochs = 100,
  loss_choice = "mse",
  metric_choice = list("mean_squared_error"),
  val_split = 0.2,
  learn_rate = 0.001,
  patience_param = 15,
  early_stopping = TRUE,
  print_info = TRUE,
  batch_size = 32,
  decay_rate = 0,
  func_resp_method = 1,
  covariate_scaling = TRUE,
  raw_data = FALSE,
  dropout = FALSE
)

Arguments

resp

For scalar responses, this is a vector of the observed dependent variable. For functional responses, this is a matrix where each row contains the basis coefficients defining the functional response (for each observation).

func_cov

The form of this argument depends on the raw_data argument. If raw_data is TRUE, this is a list of k matrices, one per functional covariate. All matrices should have the same dimensions (n x p), where n is the number of observations and p is the number of longitudinal observations. If raw_data is FALSE, the input should be a tensor (array) with dimensions b x n x k, where b is the number of basis functions used to define the functional covariates, n is the number of observations, and k is the number of functional covariates.
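
The two layouts can be illustrated with simulated data in base R. This is only a sketch under stated assumptions: the sizes, the names raw_list and coef_array, and the unnormalised Fourier design matrix Phi are all illustrative, and the least-squares projection merely stands in for a proper smoothing step such as fda::Data2fd.

```r
set.seed(1)
n <- 10   # observations
p <- 50   # longitudinal observations (time points) per curve
k <- 2    # functional covariates
b <- 5    # basis functions per covariate
tt <- seq(0, 1, length.out = p)

# raw_data = TRUE layout: a list of k matrices, each n x p
raw_list <- lapply(seq_len(k), function(j) {
  matrix(rnorm(n * p), nrow = n, ncol = p)
})

# Unnormalised Fourier design matrix (p x b), used only for illustration
Phi <- cbind(1,
             sin(2 * pi * tt), cos(2 * pi * tt),
             sin(4 * pi * tt), cos(4 * pi * tt))

# raw_data = FALSE layout: a b x n x k array of basis coefficients
coef_array <- array(dim = c(b, n, k))
for (j in seq_len(k)) {
  # Least-squares fit of every curve at once; the result is b x n
  coef_array[, , j] <- qr.solve(Phi, t(raw_list[[j]]))
}

dim(raw_list[[1]])
dim(coef_array)
```

With raw_data = TRUE the list would be passed as func_cov directly; with raw_data = FALSE the array would be passed instead.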

scalar_cov

A matrix containing the multivariate information associated with the data set. This is all of your non-longitudinal data.

basis_choice

A vector of size k (the number of functional covariates) with either "fourier" or "bspline" as the inputs. This is the choice for the basis functions used for the functional weight expansion. If you only specify one, with k > 1, then the argument will repeat that choice for all k functional covariates.
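
To make "functional weight expansion" concrete, the sketch below writes a weight function as a linear combination of Fourier basis functions and integrates it against a covariate curve. It is a hedged illustration in base R, not the package's internal code: the unnormalised basis, the coefficient values, and the trapezoidal rule are all assumptions chosen for simplicity.

```r
# Evaluation grid on the domain [0, 1]
tt <- seq(0, 1, length.out = 1001)

# Three unnormalised Fourier basis functions evaluated on the grid
Phi <- cbind(1, sin(2 * pi * tt), cos(2 * pi * tt))

# Illustrative basis coefficients; in a fitted network these would be learned
coefs <- c(0, 2, 0)

# Functional weight beta(t) = sum_m coefs[m] * phi_m(t)
beta_t <- as.vector(Phi %*% coefs)

# An example functional covariate x(t)
x_t <- sin(2 * pi * tt)

# Trapezoidal approximation of the integral of beta(t) * x(t) over [0, 1]
integrand <- beta_t * x_t
h <- diff(tt)
inner <- sum(h * (integrand[-1] + integrand[-length(integrand)]) / 2)
inner
```

Here the integrand is 2 * sin(2 * pi * t)^2, whose integral over [0, 1] is 1, so the numerical value should be very close to 1.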

num_basis

A vector of size k defining the number of basis functions to be used in the basis expansion. Must be odd for fourier basis choices. If you only specify one, with k > 1, then the argument will repeat that choice for all k functional covariates.
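
The recycling and odd-number rules described for basis_choice and num_basis can be sketched as a small check. The helper expand_basis_args is hypothetical and is not part of FuncNN; it only mirrors the behaviour described above.

```r
# Hypothetical helper (not a FuncNN function): recycle single values to
# length k and enforce the odd-number rule for Fourier bases.
expand_basis_args <- function(basis_choice, num_basis, k) {
  if (length(basis_choice) == 1) basis_choice <- rep(basis_choice, k)
  if (length(num_basis) == 1) num_basis <- rep(num_basis, k)
  stopifnot(length(basis_choice) == k, length(num_basis) == k)
  stopifnot(all(basis_choice %in% c("fourier", "bspline")))
  bad <- basis_choice == "fourier" & num_basis %% 2 == 0
  if (any(bad)) stop("num_basis must be odd for fourier basis choices")
  list(basis_choice = basis_choice, num_basis = num_basis)
}

# One choice recycled over three functional covariates
expand_basis_args("fourier", 7, k = 3)

# Mixed bases: an even number is acceptable for the bspline entry
expand_basis_args(c("fourier", "bspline"), c(5, 8), k = 2)
```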

hidden_layers

The number of hidden layers to be used in the neural network.

neurons_per_layer

Vector of size = hidden_layers. The u-th element of the vector corresponds to the number of neurons in the u-th hidden layer.

activations_in_layers

Vector of size = hidden_layers. The u-th element of the vector corresponds to the activation choice in the u-th hidden layer.
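
Because neurons_per_layer and activations_in_layers must both have one entry per hidden layer, it can help to verify the lengths before fitting. The helper check_layer_spec below is hypothetical and is shown only as a sketch of such a check.

```r
# Hypothetical helper (not a FuncNN function): TRUE when both vectors
# have exactly one entry per hidden layer.
check_layer_spec <- function(hidden_layers, neurons_per_layer,
                             activations_in_layers) {
  length(neurons_per_layer) == hidden_layers &&
    length(activations_in_layers) == hidden_layers
}

# Consistent with the defaults shown in Usage
check_layer_spec(2, c(64, 64), c("sigmoid", "linear"))

# Inconsistent: three layers requested but only two activations given
check_layer_spec(3, c(64, 64, 64), c("relu", "linear"))
```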

domain_range

List of size k. Each element of the list is a vector of length 2 containing the lower and upper bounds of the domain of the k-th functional weight, in that order, for example c(0, 1).

epochs

The number of training iterations.

loss_choice

This parameter defines the loss function used in the learning process.

metric_choice

This parameter defines the error metric that is printed during training.

val_split

The proportion of the input data set that is held out for validation during training; for example, 0.2 holds out 20% of the observations.
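
The sketch below shows how such a split would work if val_split is forwarded to Keras as validation_split, which is an assumption about the package internals. Keras takes the validation rows from the end of the supplied data, before shuffling; the sizes and variable names here are illustrative.

```r
# Assumption: val_split is passed to Keras as validation_split, which
# holds out the LAST fraction of the rows before any shuffling.
n <- 20            # number of observations (illustrative)
val_split <- 0.25  # illustrative value; the default above is 0.2

n_train <- floor(n * (1 - val_split))
train_idx <- seq_len(n_train)
val_idx <- setdiff(seq_len(n), train_idx)

train_idx
val_idx
```

If the rows are ordered (for example by time or by class), shuffling them before fitting avoids a validation set that is unrepresentative of the training data.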

learn_rate

Hyperparameter that defines how quickly you move in the direction of the gradient.

patience_param

A keras parameter that decides how many additional epochs may elapse with minimal change in error before the learning process is stopped. This is only active if early_stopping = TRUE.
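
The effect of the patience setting can be sketched with a small simulation. The function stop_epoch is hypothetical and only approximates the usual early-stopping rule (stop once the monitored error has failed to improve on its best value for `patience` consecutive epochs); which quantity fnn.fit monitors is not stated here, so the loss values are made up.

```r
# Hypothetical illustration (not a FuncNN or keras function): return the
# epoch at which training would stop under a simple patience rule.
stop_epoch <- function(val_loss, patience) {
  best <- Inf
  wait <- 0
  for (epoch in seq_along(val_loss)) {
    if (val_loss[epoch] < best) {
      best <- val_loss[epoch]   # new best value: reset the counter
      wait <- 0
    } else {
      wait <- wait + 1          # no improvement this epoch
      if (wait >= patience) return(epoch)
    }
  }
  length(val_loss)              # never triggered: all epochs are run
}

# Made-up error curve: improves for three epochs, stalls, then improves again
losses <- c(1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.69)

stop_epoch(losses, patience = 3)
stop_epoch(losses, patience = 4)
```

With a patience of 3 the run stops at epoch 6 and misses the later improvement; a patience of 4 lets training reach epoch 7.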

early_stopping

If TRUE, then learning process will be halted early if error improvement isn't seen.

print_info

If TRUE, function will output information about the model as it is trained.

batch_size

Size of the batch for stochastic gradient descent.

decay_rate

A modification to the learning rate that decreases the learning rate as more and more learning iterations are completed.
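
As a rough illustration, the sketch below uses the time-based decay rule of the legacy Keras optimizers, lr / (1 + decay * iteration). Whether fnn.fit forwards decay_rate to that optimizer argument is an assumption, and lr_at is a hypothetical helper.

```r
# Hypothetical helper: effective learning rate after a given number of
# update iterations under time-based decay (legacy Keras optimizer rule).
lr_at <- function(learn_rate, decay_rate, iteration) {
  learn_rate / (1 + decay_rate * iteration)
}

# decay_rate = 0 (the default above) leaves the learning rate unchanged
lr_at(0.001, 0, iteration = c(0, 100, 1000))

# A positive decay_rate shrinks the learning rate as training proceeds
lr_at(0.001, 0.01, iteration = c(0, 100, 1000))
```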

func_resp_method

Set to 1 by default. In the future, this will be set to 2 for an alternative functional response approach.

covariate_scaling

If TRUE, then data will be internally scaled before model development.

raw_data

If TRUE, then user does not need to create functional observations beforehand. The function will internally take care of that pre-processing.

dropout

Keras parameter that randomly drops some percentage of the neurons in a given layer. If TRUE, then a proportion of 0.1*layer_number will be dropped in each layer; alternatively, you can specify a vector with length equal to the number of layers, giving the proportion to drop in each layer.
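
The rule above can be sketched as follows. The helper dropout_rates is hypothetical, and it reads "layer_number" as the index of the hidden layer, which is an interpretation of the description rather than confirmed package behaviour.

```r
# Hypothetical helper (not a FuncNN function): per-layer dropout
# proportions implied by the dropout argument.
dropout_rates <- function(dropout, hidden_layers) {
  if (isTRUE(dropout)) {
    0.1 * seq_len(hidden_layers)   # 0.1, 0.2, ... by layer index
  } else if (isFALSE(dropout)) {
    rep(0, hidden_layers)          # no dropout
  } else {
    stopifnot(length(dropout) == hidden_layers)
    dropout                        # user-supplied proportions
  }
}

dropout_rates(TRUE, hidden_layers = 3)
dropout_rates(FALSE, hidden_layers = 3)
dropout_rates(c(0.2, 0.5), hidden_layers = 2)
```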

Details

Updates coming soon.

Value

The following are returned:

model – Full keras model that can be used with any functions that act on keras models.

data – Adjusted data set after scaling and appending of scalar covariates.

fnc_basis_num – A return of the original input; describes the number of functions used in each of the k basis expansions.

fnc_type – A return of the original input; describes the basis expansion used to make the functional weights.

parameter_info – Information associated with hyperparameter choices in the model.

per_iter_info – Change in error over training iterations.

func_obs – In the case when raw_data is TRUE, the user may want to see the internally developed functional observations. This returns those functions.

Examples

 
# First, an easy example with raw_data = TRUE


# Loading in data
data("daily")

# Functional covariate: one row per station, one column per day
precip = t(daily$precav)
longitudinal_dat = list(precip)

# Scalar response: average daily precipitation for each station
total_prec = apply(daily$precav, 2, mean)

# Running model
fit1 = fnn.fit(resp = total_prec,
               func_cov = longitudinal_dat,
               scalar_cov = NULL,
               learn_rate = 0.0001,
               epochs = 10,
               raw_data = TRUE)
               
# Classification Example with raw_data = TRUE

# Loading data
tecator = FuncNN::tecator

# Making classification bins
tecator_resp = as.factor(ifelse(tecator$y$Fat > 25, 1, 0))

# Non functional covariate
tecator_scalar = data.frame(water = tecator$y$Water)

# Splitting data
ind = sample(1:length(tecator_resp), round(0.75*length(tecator_resp)))
train_y = tecator_resp[ind]
test_y = tecator_resp[-ind]
train_x = tecator$absorp.fdata$data[ind,]
test_x = tecator$absorp.fdata$data[-ind,]
scalar_train = data.frame(tecator_scalar[ind,1])
scalar_test = data.frame(tecator_scalar[-ind,1])

# Making list element to pass in
func_covs_train = list(train_x)
func_covs_test = list(test_x)

# Now running model
fit_class = fnn.fit(resp = train_y,
                    func_cov = func_covs_train,
                    scalar_cov = scalar_train,
                    hidden_layers = 6,
                    neurons_per_layer = c(24, 24, 24, 24, 24, 58),
                    activations_in_layers = c("relu", "relu", "relu", "relu", "relu", "linear"),
                    domain_range = list(c(850, 1050)),
                    learn_rate = 0.001,
                    epochs = 100,
                    raw_data = TRUE,
                    early_stopping = TRUE)

# Running prediction, gets probabilities
predict_class = fnn.predict(fit_class,
                            func_cov = func_covs_test,
                            scalar_cov = scalar_test,
                            domain_range = list(c(850, 1050)),
                            raw_data = TRUE)

# Example with Pre-Processing (raw_data = FALSE)

# loading data
tecator = FuncNN::tecator

# libraries
library(fda)

# define the time points on which the functional predictor is observed.
timepts = tecator$absorp.fdata$argvals

# define the fourier basis
nbasis = 29
fourier_basis = create.fourier.basis(tecator$absorp.fdata$rangeval, nbasis)

# convert the functional predictor into an fd object and compute its derivatives
tecator_fd =  Data2fd(timepts, t(tecator$absorp.fdata$data), fourier_basis)
tecator_deriv = deriv.fd(tecator_fd)
tecator_deriv2 = deriv.fd(tecator_deriv)

# Non functional covariate
tecator_scalar = data.frame(water = tecator$y$Water)

# Response
tecator_resp = tecator$y$Fat

# Getting data into right format
tecator_data = array(dim = c(nbasis, length(tecator_resp), 3))
tecator_data[,,1] = tecator_fd$coefs
tecator_data[,,2] = tecator_deriv$coefs
tecator_data[,,3] = tecator_deriv2$coefs

# Splitting into test and train
ind = 1:165
tec_data_train = tecator_data[, ind, ]
tec_data_test = tecator_data[, -ind, ]
tecResp_train = tecator_resp[ind]
tecResp_test = tecator_resp[-ind]
scalar_train = data.frame(tecator_scalar[ind,1])
scalar_test = data.frame(tecator_scalar[-ind,1])

# Setting up network
tecator_fnn = fnn.fit(resp = tecResp_train,
                      func_cov = tec_data_train,
                      scalar_cov = scalar_train,
                      basis_choice = c("fourier", "fourier", "fourier"),
                      num_basis = c(5, 5, 7),
                      hidden_layers = 4,
                      neurons_per_layer = c(64, 64, 64, 64),
                      activations_in_layers = c("relu", "relu", "relu", "linear"),
                      domain_range = list(c(850, 1050), c(850, 1050), c(850, 1050)),
                      epochs = 300,
                      learn_rate = 0.002)

# A prediction example can be seen with ?fnn.predict

# Functional Response Example:

# libraries
library(fda)

# Loading data
data("daily")

# Creating functional data
temp_data = array(dim = c(65, 35, 1))
tempbasis65  = create.fourier.basis(c(0,365), 65)
tempbasis7 = create.bspline.basis(c(0,365), 7, norder = 4)
timepts = seq(1, 365, 1)
temp_fd = Data2fd(timepts, daily$tempav, tempbasis65)
prec_fd = Data2fd(timepts, daily$precav, tempbasis7)
prec_fd$coefs = scale(prec_fd$coefs)

# Data set up
temp_data[,,1] = temp_fd$coefs
resp_mat = prec_fd$coefs

# Non functional covariate
weather_scalar = data.frame(total_prec = apply(daily$precav, 2, sum))

# Getting data into proper format (all stations are used for training)
nbasis = 65
weather_data_train <- array(dim = c(nbasis, ncol(temp_data), 1))
weather_data_train[,,1] = temp_data
scalar_train = data.frame(weather_scalar[,1])
resp_train = t(resp_mat)

# Running model
weather_func_fnn <- fnn.fit(resp = resp_train,
                            func_cov = weather_data_train,
                            scalar_cov = scalar_train,
                            basis_choice = c("bspline"),
                            num_basis = c(7),
                            hidden_layers = 2,
                            neurons_per_layer = c(1024, 1024),
                            activations_in_layers = c("sigmoid", "linear"),
                            domain_range = list(c(1, 365)),
                            epochs = 300,
                            learn_rate = 0.01,
                            func_resp_method = 1)




[Package FuncNN version 1.0 Index]