fase_seq {fase}    R Documentation

Functional adjacency spectral embedding (sequential algorithm)

Description

fase_seq fits a functional adjacency spectral embedding to snapshots of (undirected) functional network data, with each of the d latent dimensions fit sequentially. The latent processes are fit in a spline basis specified by the user, with additional options for ridge penalization.

Usage

fase_seq(A,d,self_loops,spline_design,lambda,optim_options,output_options)

Arguments

A

An n \times n \times m array containing the snapshots of the functional network.

d

A positive integer, the number of latent space dimensions of the functional embedding.

self_loops

A Boolean, if FALSE, all diagonal entries are ignored in optimization. Defaults to TRUE.

spline_design

A list, containing the spline design information. For fitting with a B-spline design (the default):

type

The string 'bs'.

q

A positive integer, the dimension of the B-spline basis.

x_vec

A vector, the snapshot evaluation indices for the data. Defaults to an equally spaced vector of length m from 0 to 1.

x_max

A scalar, the maximum of the index space. Defaults to max(spline_design$x_vec).

x_min

A scalar, the minimum of the index space. Defaults to min(spline_design$x_vec).

spline_matrix

An m \times q matrix, the B-spline basis evaluated at the snapshot indices. If not specified, it will be calculated internally.

ridge_mat

The m \times m matrix for the generalized ridge penalty. If lambda > 0, defaults to diag(m).
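As a rough illustration of the B-spline design elements above, the following sketch builds an equally spaced x_vec and an m \times q cubic B-spline basis using splines::bs from base R. This is an assumption about how such a matrix could be constructed by hand; the package's internal construction may differ (for example in knot placement), so treat it as a sketch rather than the package's method.

```r
library(splines)

m <- 50   # number of snapshots
q <- 9    # B-spline basis dimension

# Index vector in the style of the default: m equally spaced points on [0, 1]
x_vec <- seq(0, 1, length.out = m)
x_min <- min(x_vec)
x_max <- max(x_vec)

# Cubic B-spline basis with q columns. bs() places interior knots at
# quantiles of x_vec, which are equally spaced when x_vec is.
B <- bs(x_vec, df = q, degree = 3, intercept = TRUE,
        Boundary.knots = c(x_min, x_max))

# Drop the bs() attributes to obtain a plain m x q matrix
spline_matrix <- matrix(as.numeric(B), nrow = m, ncol = q)

dim(spline_matrix)
```

A matrix like this could then be passed as spline_design$spline_matrix alongside type='bs' and q.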

For fitting with a smoothing spline design:

type

The string 'ss'.

x_vec

A vector, the snapshot evaluation indices for the data. Defaults to an equally spaced vector of length m from 0 to 1.

x_max

A scalar, the maximum of the index space. Defaults to max(spline_design$x_vec).

x_min

A scalar, the minimum of the index space. Defaults to min(spline_design$x_vec).

spline_matrix

An m \times m matrix, the natural cubic spline basis evaluated at the snapshot indices. If not specified, it will be calculated internally.

ridge_mat

The m \times m matrix for the generalized ridge penalty. Defaults to the second derivatives of the natural cubic spline basis evaluated at the snapshot indices.

lambda

A non-negative scalar, the scale factor for the generalized ridge penalty (see Details). Defaults to 0.

optim_options

A list, containing additional optional arguments controlling the gradient descent algorithm.

eps

A positive scalar, the convergence threshold for gradient descent in terms of relative change in objective value. Defaults to 1e-5.

eta

A positive scalar, the step size for gradient descent. Defaults to 1/(n*m).

K_max

A positive integer, the maximum iterations for gradient descent. Defaults to 2e3.

verbose

A Boolean, if TRUE, console output will provide updates on the progress of gradient descent. Defaults to FALSE.

init_W

A 3-dimensional array containing initial basis coordinates for gradient descent. Dimension should be n \times spline_design$q \times d for B-spline designs, and n \times m \times d for smoothing spline designs. If included, init_M, init_L and init_sigma are ignored.

init_sigma

A positive scalar, the estimated edge dispersion parameter to calibrate initialization. If not provided, it is either estimated using the robust method proposed by Gavish and Donoho (2014) for weighted edge networks, or set to a default value 0.5 for binary edge networks.

init_L

A positive integer, the number of contiguous groups used for initialization. Defaults to the floor of (2nm/init_sigma^2)^{1/3}.

init_M

A positive integer, the number of snapshots averaged in each group for initialization. By default, all snapshots are used.
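To make the stated defaults concrete, the arithmetic for eta and init_L can be worked through directly. The helper name init_L_default below is hypothetical and not part of the package; it only restates the documented formula.

```r
n <- 50   # number of nodes
m <- 50   # number of snapshots

# Documented default step size: 1/(n*m)
eta_default <- 1 / (n * m)
eta_default                      # 4e-04

# Documented default number of groups: floor((2nm/init_sigma^2)^(1/3))
# (hypothetical helper, for illustration only)
init_L_default <- function(n, m, init_sigma) {
  floor((2 * n * m / init_sigma^2)^(1 / 3))
}

init_L_default(n, m, init_sigma = 0.5)   # 27 (binary-edge default dispersion)
init_L_default(n, m, init_sigma = 4)     # 6
```

A larger dispersion estimate leads to fewer groups, so more snapshots fall in each group.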

output_options

A list, containing additional optional arguments controlling the output of fase_seq.

return_coords

A Boolean, if TRUE, the basis coordinates for each latent process component are also returned as an array. Defaults to FALSE.

return_ngcv

A Boolean, if TRUE and spline_design$type=='bs', the network generalized cross validation criterion is returned. Defaults to TRUE.

Details

Note that fase_seq is a wrapper for fase. When d=1, fase_seq coincides with fase.

fase_seq finds a functional adjacency spectral embedding of an n \times n \times m array A of symmetric adjacency matrices on a common set of nodes, where each n \times n slice is associated to a scalar index x_k for k=1,...,m. Embedding requires the specification of a latent space dimension d and spline design information (with the argument spline_design).
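The expected input layout can be sketched with a small simulated array. The values here are arbitrary noise, used only to show the n \times n \times m shape, the symmetry of each slice, and the pairing of slices with indices x_k; none of it comes from the package.

```r
set.seed(1)
n <- 5   # nodes
m <- 3   # snapshots

# Build an n x n x m array whose slices are symmetric matrices
A <- array(0, dim = c(n, n, m))
for (k in seq_len(m)) {
  E <- matrix(rnorm(n * n), n, n)
  A[, , k] <- (E + t(E)) / 2   # symmetrize slice k
}

# One scalar index per snapshot
x_vec <- seq(0, 1, length.out = m)

# Check the layout before fitting
is_sym <- vapply(seq_len(m), function(k) isSymmetric(A[, , k]), logical(1))
dim(A)
all(is_sym)
```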

fase_seq can fit latent processes using either a cubic B-spline basis with equally spaced knots, or a natural cubic spline basis with a second derivative (generalized ridge) smoothing penalty: a smoothing spline. To fit with a B-spline design (spline_design$type = 'bs'), one must minimally provide a basis dimension q of at least 4 and at most m.

When fitting with a smoothing spline design, the generalized ridge penalty is scaled by \lambda/n, where \lambda is specified by the argument lambda. See MacDonald et al. (2022+), Appendix E for more details. lambda can also be used to introduce a ridge penalty on the basis coordinates when fitting with B-splines.

Fitting minimizes a least squares loss, using gradient descent (Algorithm 1) on the basis coordinates w_{i,r} of each component process

z_{i,r}(x) = w_{i,r}^{T}B(x).
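The relation above can be sketched numerically: given basis coordinates W (n \times q \times d) and a basis matrix B (m \times q) evaluated at the snapshot indices, the latent processes at those indices form an n \times d \times m array. The matrices below are random stand-ins, not output of the package.

```r
set.seed(1)
n <- 4; q <- 5; d <- 2; m <- 6

# Stand-in basis coordinates: W[i, , r] plays the role of w_{i,r}
W <- array(rnorm(n * q * d), dim = c(n, q, d))

# Stand-in basis matrix: row k plays the role of B(x_k)
B <- matrix(runif(m * q), m, q)

# Z[i, r, k] = w_{i,r}^T B(x_k)
Z <- array(0, dim = c(n, d, m))
for (r in seq_len(d)) {
  Z[, r, ] <- W[, , r] %*% t(B)   # n x m block for component r
}

dim(Z)
```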

Additional options for the fitting algorithm, including initialization, can be specified by the argument optim_options. For more details on the fitting and initialization algorithms, see MacDonald et al. (2022+), Section 3.
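The roles of eps, eta, K_max and the converged code can be illustrated with a toy gradient descent. This is not the package's Algorithm 1: it fits a single rank-one snapshot with an assumed loss ||A - z z^T||_F^2, and the step size is chosen for this toy problem rather than taken from the documented default.

```r
set.seed(1)
n <- 20
z_true <- rnorm(n)
E <- matrix(rnorm(n * n, sd = 0.1), n, n)
A <- tcrossprod(z_true) + (E + t(E)) / 2   # symmetric noisy rank-one matrix

# Assumed toy loss and its gradient (valid because A is symmetric)
loss <- function(z) sum((A - tcrossprod(z))^2)
grad <- function(z) as.vector(-4 * (A - tcrossprod(z)) %*% z)

eps   <- 1e-5   # relative-change convergence threshold
eta   <- 1e-3   # step size (chosen for this toy problem)
K_max <- 2000   # maximum number of iterations

z <- z_true + rnorm(n, sd = 0.5)   # perturbed starting point
obj_init <- loss(z)
obj_old <- obj_init
converged <- 0L
K <- 0L
for (k in seq_len(K_max)) {
  z <- z - eta * grad(z)           # gradient step
  obj_new <- loss(z)
  K <- k
  # Stop when the relative change in objective value falls below eps
  if (abs(obj_old - obj_new) / obj_old < eps) {
    converged <- 1L
    break
  }
  obj_old <- obj_new
}

c(K = K, converged = converged)
```

If the loop exhausts K_max without meeting the threshold, converged stays at 0, mirroring the convergence code described under Value.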

By default, fase_seq will return estimates of the latent processes evaluated at the snapshot indices as an n \times d \times m array, after performing a Procrustes alignment of the consecutive snapshots. This extra alignment step can be skipped. fase_seq will also return the spline design information used to fit the embedding, convergence information for gradient descent, and (if specified) the basis coordinates.
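For intuition about the alignment step, here is a minimal sketch of an orthogonal Procrustes alignment between two n \times d snapshots, solved with an SVD. The function name procrustes_align is hypothetical, and the package's own alignment across all consecutive snapshots may be implemented differently.

```r
# Rotate X to best match Y in Frobenius norm:
# minimize ||X Q - Y||_F over orthogonal Q (hypothetical helper)
procrustes_align <- function(X, Y) {
  s <- svd(crossprod(X, Y))   # X^T Y = U D V^T
  Q <- s$u %*% t(s$v)         # optimal orthogonal transformation
  X %*% Q
}

set.seed(1)
n <- 10; d <- 2
Y <- matrix(rnorm(n * d), n, d)

# X is Y under a known rotation, so alignment should recover Y
theta <- pi / 3
Q0 <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), d, d)
X <- Y %*% Q0

X_aligned <- procrustes_align(X, Y)
max(abs(X_aligned - Y))   # numerically zero
```

Applied sequentially, each slice Z[, , k] could be aligned to the already aligned slice Z[, , k - 1] in the same way.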

When fitting with B-splines, fase_seq can return a network generalized cross validation criterion, described in MacDonald et al. (2022+), Section 3.3. This criterion can be minimized to choose appropriate values for q and d.
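One way to use the criterion is a simple grid search over q, sketched below with the same simulated data as in Examples. The grid values are illustrative assumptions, and the small K_max is only for speed; the code runs the fits only if the fase package is installed.

```r
q_grid <- c(6, 9, 12)                        # illustrative candidate values of q
ngcv_vals <- rep(NA_real_, length(q_grid))   # criterion value per candidate

if (requireNamespace("fase", quietly = TRUE)) {
  set.seed(1)
  data <- fase::gaussian_snapshot_ss(n = 50, d = 2,
                                     x_vec = seq(0, 1, length.out = 50),
                                     self_loops = FALSE, sigma_edge = 4)
  for (j in seq_along(q_grid)) {
    fit <- fase::fase_seq(data$A, d = 2, self_loops = FALSE,
                          spline_design = list(type = "bs", q = q_grid[j],
                                               x_vec = data$spline_design$x_vec),
                          optim_options = list(eps = 1e-4, K_max = 40),
                          output_options = list(return_ngcv = TRUE))
    ngcv_vals[j] <- fit$ngcv
  }
  # Candidate with the smallest criterion value
  q_best <- q_grid[which.min(ngcv_vals)]
}
```

The same loop could be nested over candidate values of d to choose both tuning parameters jointly.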

Value

A list is returned with the functional adjacency spectral embedding, the spline design information, and some additional optimization output:

Z

An n \times d \times m array containing the latent process embedding evaluated at the indices in spline_design$x_vec.

W

For B-spline designs, an n \times q \times d array; or for smoothing spline designs, an n \times m \times d array of estimated basis coordinates. If output_options$return_coords is FALSE, this is not returned.

spline_design

A list, describing the spline design:

type

A string, either 'bs' or 'ss'.

q

A positive integer, the dimension of the B-spline basis. Only returned for B-spline designs.

x_vec

A vector, the snapshot evaluation indices for the data.

x_max

A scalar, the maximum of the index space.

x_min

A scalar, the minimum of the index space.

spline_matrix

For B-spline designs, an m \times q matrix; or for smoothing spline designs, an m \times m matrix, the basis evaluated at the snapshot indices.

ridge_matrix

An m \times m matrix used in the generalized ridge penalty. Only returned for lambda > 0.

ngcv

A scalar, the network generalized cross validation criterion (see Details). Only returned for B-spline designs and when output_options$return_ngcv is TRUE.

K

A positive integer, the number of iterations run in gradient descent.

converged

An integer convergence code, 1 if gradient descent converged in fewer than optim_options$K_max iterations, 0 otherwise.

Examples

# Gaussian edge data with sinusoidal latent processes
set.seed(1)
data <- gaussian_snapshot_ss(n=50,d=2,
                             x_vec=seq(0,1,length.out=50),
                             self_loops=FALSE,sigma_edge=4)


# fase fit with B-spline design
fit_bs <- fase_seq(data$A,d=2,self_loops=FALSE,
                   spline_design=list(type='bs',q=9,x_vec=data$spline_design$x_vec),
                   optim_options=list(eps=1e-4,K_max=40),
                   output_options=list(return_coords=TRUE))

# fase fit with smoothing spline design
fit_ss <- fase_seq(data$A,d=2,self_loops=FALSE,
                   spline_design=list(type='ss',x_vec=data$spline_design$x_vec),
                   lambda=.5,
                   optim_options=list(eta=1e-4,K_max=40,verbose=FALSE))

# NOTE: both models are fit with a small optim_options$K_max=40 for demonstration


[Package fase version 1.0.1 Index]