multiness_fit {multiness} R Documentation

Fit the MultiNeSS model

Description

multiness_fit fits the Gaussian or logistic MultiNeSS model with various options for parameter tuning.

Usage

multiness_fit(A,model,self_loops,refit,tuning,tuning_opts,optim_opts)

Arguments

A

An n \times n \times m array containing edge entries for an undirected multiplex network on n nodes and m layers.

model

A string specifying the model, either 'gaussian' or 'logistic'. Defaults to 'gaussian'.

self_loops

A Boolean, if FALSE, all diagonal entries are ignored in optimization. Defaults to TRUE.

refit

A Boolean, if TRUE, a refitting step is performed to debias the eigenvalues of the estimates. Defaults to TRUE.

tuning

A string specifying the tuning method; valid options are 'fixed', 'adaptive', or 'cv'. Defaults to 'adaptive'.

tuning_opts

A list containing additional optional arguments controlling parameter tuning. The arguments used depend on the choice of tuning method. If tuning='fixed', multiness_fit will use the following arguments:

lambda

A positive scalar, the \lambda parameter in the nuclear norm penalty, see Details. Defaults to 2.309 * sqrt(n*m).

alpha

A positive scalar or numeric vector of length m, the parameters \alpha_k in the nuclear norm penalty, see Details. If a scalar is provided, all \alpha_k parameters are set to that value. Defaults to 1/sqrt(m).
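As a rough illustration of the documented defaults for tuning='fixed', the following base R sketch computes them by hand. The dimensions n and m are hypothetical values chosen for the example, and the code does not call the package itself.

```r
# Hypothetical dimensions, chosen only for illustration
n <- 100  # number of nodes
m <- 4    # number of layers

# Documented defaults for tuning = 'fixed'
lambda_default <- 2.309 * sqrt(n * m)
alpha_default  <- 1 / sqrt(m)

# Writing alpha as a vector of length m gives one value per layer,
# which is what a scalar alpha is documented to stand for
alpha_vec <- rep(alpha_default, m)

# A list in the shape expected by the tuning_opts argument
tuning_opts <- list(lambda = lambda_default, alpha = alpha_vec)
print(tuning_opts$lambda)
print(tuning_opts$alpha)
```

A list built this way could be passed as tuning_opts together with tuning='fixed'.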

If tuning='adaptive', multiness_fit will utilize the following arguments:

layer_wise

A Boolean, if TRUE, the entry-wise variance is estimated individually for each layer. Otherwise the estimates are pooled. Defaults to TRUE.

penalty_const

A positive scalar C which scales the penalty parameters (see Details). Defaults to 2.309.

penalty_const_lambda

A positive scalar c which scales only the \lambda penalty parameter (see Details). Defaults to 1.

If tuning='cv', multiness_fit will utilize the following arguments:

layer_wise

A Boolean, if TRUE, the entry-wise variance is estimated individually for each layer. Otherwise the estimates are pooled. Defaults to TRUE.

N_cv

A positive integer, the number of repetitions of edge cross-validation performed for each parameter setting. Defaults to 3.

p_cv

A positive scalar in the interval (0,1), the proportion of edge entries held out in edge cross-validation. Defaults to 0.1.

penalty_const_lambda

A positive scalar c which scales only the \lambda penalty parameter (see Details). Defaults to 1.

penalty_const_vec

A numeric vector with positive entries, the candidate values of constant C to scale the penalty parameters (see Details). An optimal constant is chosen by edge cross-validation. Defaults to c(1,1.5,...,3.5,4).

refit_cv

A Boolean, if TRUE, a refitting step is performed when fitting the model for edge cross-validation. Defaults to TRUE.

verbose_cv

A Boolean, if TRUE, console output will provide updates on the progress of edge cross-validation. Defaults to FALSE.
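To make the role of p_cv more concrete, here is a minimal sketch of how a proportion of edge entries might be held out symmetrically in each layer. The helper make_holdout_mask is hypothetical and the sampling scheme (independent draws on the upper triangle) is an assumption; the package's internal cross-validation routine may differ.

```r
set.seed(1)
n <- 20; m <- 3; p_cv <- 0.1  # illustrative sizes; p_cv matches the documented default

# Hypothetical helper: TRUE marks entries kept for fitting, FALSE marks held-out entries
make_holdout_mask <- function(n, m, p_cv) {
  mask <- array(TRUE, dim = c(n, n, m))
  for (k in seq_len(m)) {
    layer <- matrix(TRUE, n, n)
    upper <- upper.tri(layer, diag = TRUE)
    # Hold out each upper-triangle entry independently with probability p_cv
    layer[upper] <- runif(sum(upper)) >= p_cv
    # Mirror the upper triangle so an undirected edge is held out in both positions
    layer[lower.tri(layer)] <- t(layer)[lower.tri(layer)]
    mask[, , k] <- layer
  }
  mask
}

train_mask <- make_holdout_mask(n, m, p_cv)
print(mean(!train_mask))  # realized proportion of held-out entries
```

Repeating such a split N_cv times for each candidate in penalty_const_vec gives one held-out error per repetition and candidate, from which a constant can be chosen.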

optim_opts

A list, containing additional optional arguments controlling the proximal gradient descent algorithm.

check_obj

A Boolean, if TRUE, convergence is determined by checking the decrease in the objective. Otherwise it is determined by checking the average entry-wise difference in consecutive values of F. Defaults to TRUE.

eig_maxitr

A positive integer, maximum iterations for internal eigenvalue solver. Defaults to 1000.

eig_prec

A positive scalar, estimated eigenvalues below this threshold are set to zero. Defaults to 1e-2.

eps

A positive scalar, convergence threshold for proximal gradient descent. Defaults to 1e-6.

eta

A positive scalar, step size for proximal gradient descent. Defaults to 1 for the Gaussian model, 5 for the logistic model.

init

A string, initialization method. Valid options are 'fix' (using initializers optim_opts$V_init and optim_opts$U_init), 'zero' (initialize all parameters at zero), or 'svd' (initialize with a truncated SVD with rank optim_opts$init_rank). Defaults to 'zero'.

K_max

A positive integer, maximum iterations for proximal gradient descent. Defaults to 100.

max_rank

A positive integer, maximum rank for internal eigenvalue solver. Defaults to sqrt(n).

missing_pattern

An n \times n \times m Boolean array with TRUE for each observed entry and FALSE for missing entries. If unspecified, it is set to !is.na(A).

positive

A Boolean, if TRUE, singular value thresholding only retains positive eigenvalues. Defaults to FALSE.

return_posns

A Boolean, if TRUE, returns estimates of the latent positions based on the adjacency spectral embedding (ASE). Defaults to FALSE.

verbose

A Boolean, if TRUE, console output will provide updates on the progress of proximal gradient descent. Defaults to FALSE.
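The sketch below shows one way an optim_opts list with an explicit missing_pattern might be assembled. The toy array A and the particular option values are assumptions made for illustration; only the argument names come from the list above.

```r
set.seed(2)
n <- 10; m <- 2  # illustrative sizes

# Toy symmetric Gaussian layers (illustrative data only)
A <- array(0, dim = c(n, n, m))
for (k in seq_len(m)) {
  E <- matrix(rnorm(n * n), n, n)
  A[, , k] <- (E + t(E)) / 2
}

# Mark one undirected edge in layer 1 as unobserved
A[1, 2, 1] <- NA
A[2, 1, 1] <- NA

optim_opts <- list(
  missing_pattern = !is.na(A),  # the documented default, written out explicitly
  max_rank = 5,                 # explicit choice for this example
  K_max = 100,                  # documented default
  eps = 1e-6,                   # documented default
  verbose = FALSE
)

print(sum(!optim_opts$missing_pattern))  # number of entries flagged as missing
```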

Details

A MultiNeSS model is fit to an n \times n \times m array A of symmetric adjacency matrices on a common set of nodes. Fitting proceeds by convex proximal gradient descent on the entries of F = VV^{T} and G_k = U_kU_k^{T}, see MacDonald et al. (2020), Section 3.2. Additional optional arguments for the gradient descent routine can be provided in optim_opts. The refit argument provides an option to perform an additional refitting step to debias the eigenvalues of the estimates, see MacDonald et al. (2020), Section 3.3.
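For intuition about the proximal step under a nuclear norm penalty, the following base R sketch soft-thresholds the eigenvalues of a symmetric matrix, which for symmetric input is equivalent to singular value thresholding. The helper soft_threshold_eigen is hypothetical and is not the package's internal routine; its positive argument mimics the behaviour described for optim_opts$positive.

```r
# Hypothetical helper: proximal operator of threshold * (nuclear norm)
# for a symmetric matrix M, computed through its eigendecomposition
soft_threshold_eigen <- function(M, threshold, positive = FALSE) {
  eig <- eigen((M + t(M)) / 2, symmetric = TRUE)
  # Shrink each eigenvalue towards zero by 'threshold', keeping its sign
  shrunk <- sign(eig$values) * pmax(abs(eig$values) - threshold, 0)
  # Optionally discard negative eigenvalues
  if (positive) shrunk[shrunk < 0] <- 0
  # Reassemble V diag(shrunk) V^T
  eig$vectors %*% (shrunk * t(eig$vectors))
}

M <- diag(c(3, 1, -2))
print(round(soft_threshold_eigen(M, threshold = 1.5), 10))
print(round(soft_threshold_eigen(M, threshold = 1.5, positive = TRUE), 10))
```

In this toy case the eigenvalues 3, 1 and -2 become 1.5, 0 and -0.5, and the last is set to zero when positive = TRUE.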

By default, multiness_fit will return estimates of the matrices F and G_k. optim_opts$return_posns provides an option to instead return estimates of latent positions V and U_k based on the adjacency spectral embedding (if such a factorization exists).
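The relationship between F and V can be illustrated with a minimal adjacency spectral embedding in base R. The helper ase below is a hypothetical sketch, assuming the input is symmetric and its leading d eigenvalues are non-negative; it is not taken from the package source.

```r
set.seed(3)
n <- 8; d <- 2  # illustrative sizes
V <- matrix(rnorm(n * d), n, d)
F_mat <- V %*% t(V)  # rank-d positive semi-definite matrix

# Hypothetical helper: embed a symmetric matrix using its top d eigenpairs
ase <- function(M, d) {
  eig <- eigen(M, symmetric = TRUE)
  vals <- eig$values[seq_len(d)]
  # The factorization M = V V^T only exists if these eigenvalues are non-negative
  stopifnot(all(vals >= 0))
  eig$vectors[, seq_len(d), drop = FALSE] %*% diag(sqrt(vals), nrow = d)
}

V_hat <- ase(F_mat, d)
print(max(abs(V_hat %*% t(V_hat) - F_mat)))  # reconstruction error, near zero
```

Note that V_hat recovers V only up to an orthogonal transformation, so the product V_hat V_hat^T is the quantity to compare, not the positions themselves.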

Tuning parameters \lambda and \alpha_k in the nuclear norm penalty

\lambda ||F||_* + \sum_k \lambda \alpha_k ||G_k||_*

are either set by the user (tuning='fixed'), selected adaptively using a robust estimator of the entry-wise variance (tuning='adaptive'), or selected using edge cross-validation (tuning='cv'). For more details see MacDonald et al. (2020), Section 3.4. Additional optional arguments for parameter tuning can be provided in tuning_opts.
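As a worked instance of the penalty above, this base R sketch evaluates it for small hand-picked matrices. The function names nuclear_norm and multiness_penalty are hypothetical helpers introduced only for this example.

```r
# Nuclear norm: the sum of the singular values
nuclear_norm <- function(M) sum(svd(M, nu = 0, nv = 0)$d)

# Hypothetical helper: lambda * ||F||_* + sum_k lambda * alpha_k * ||G_k||_*
multiness_penalty <- function(F_mat, G_list, lambda, alpha) {
  alpha <- rep_len(alpha, length(G_list))  # a scalar alpha applies to every layer
  lambda * nuclear_norm(F_mat) +
    sum(lambda * alpha * vapply(G_list, nuclear_norm, numeric(1)))
}

F_mat  <- diag(c(2, 1, 0))                            # nuclear norm 3
G_list <- list(diag(c(1, 0, 0)), diag(c(0, 3, -1)))   # nuclear norms 1 and 4

print(multiness_penalty(F_mat, G_list, lambda = 2, alpha = 0.5))
# 2 * 3 + 2 * 0.5 * (1 + 4) = 11
```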

Value

A list is returned with the MultiNeSS model estimates, dimensions of the common and individual latent spaces, and some additional optimization output:

F_hat

An n \times n matrix estimating the common part of the expected adjacency matrix, F = VV^{T}. If optim_opts$return_posns is TRUE, this is not returned.

G_hat

A list of length m, the collection of n \times n matrices estimating the individual part of each adjacency matrix, G_k = U_kU_k^{T}. If optim_opts$return_posns is TRUE, this is not returned.

V_hat

A matrix estimating the common latent positions. Returned if optim_opts$return_posns is TRUE.

U_hat

A list of length m, the collection of matrices estimating the individual latent positions. Returned if optim_opts$return_posns is TRUE.

d1

A non-negative integer, the estimated common dimension of the latent space.

d2

An integer vector of length m, the estimated individual dimension of the latent space for each layer.

K

A positive integer, the number of iterations run in proximal gradient descent.

convergence

An integer convergence code, 0 if proximal gradient descent converged in fewer than optim_opts$K_max iterations, 1 otherwise.

lambda

A positive scalar, the tuned \lambda penalty parameter (see Details).

alpha

A numeric vector of length m, the tuned \alpha penalty parameters (see Details).

Examples

# simulate data from the Gaussian model
data1 <- multiness_sim(n=100,m=4,d1=2,d2=2,
                       model="gaussian")

# multiness_fit with fixed tuning
fit1 <- multiness_fit(A=data1$A,
                      model="gaussian",
                      self_loops=TRUE,
                      refit=FALSE,
                      tuning="fixed",
                      tuning_opts=list(lambda=40,alpha=1/2),
                      optim_opts=list(max_rank=20,verbose=TRUE))

# multiness_fit with adaptive tuning
fit2 <- multiness_fit(A=data1$A,
                      refit=TRUE,
                      tuning="adaptive",
                      tuning_opts=list(layer_wise=FALSE),
                      optim_opts=list(return_posns=TRUE))

# logistic model data
data2 <- multiness_sim(n=100,m=4,d1=2,d2=2,
                       model="logistic",
                       self_loops=FALSE)

# multiness_fit with cv tuning
fit3 <- multiness_fit(A=data2$A,
                      model="logistic",
                      self_loops=FALSE,
                      tuning="cv",
                      tuning_opts=list(N_cv=2,
                                       penalty_const_vec=c(1,2,2.309,3),
                                       verbose_cv=TRUE))


[Package multiness version 1.0.2 Index]