
multiness_fit {multiness}

R Documentation

Fit the MultiNeSS model

Description

multiness_fit fits the Gaussian or logistic MultiNeSS model with various options for parameter tuning.

Usage

multiness_fit(A,model,self_loops,refit,tuning,tuning_opts,optim_opts)

Arguments

`A`	An `n \times n \times m` array containing edge entries for an undirected multiplex network on `n` nodes and `m` layers.
`model`	A string specifying the model, either `'gaussian'` or `'logistic'`. Defaults to `'gaussian'`.
`self_loops`	A Boolean; if `FALSE`, all diagonal entries are ignored in optimization. Defaults to `TRUE`.
`refit`	A Boolean; if `TRUE`, a refitting step is performed to debias the eigenvalues of the estimates. Defaults to `TRUE`.
`tuning`	A string specifying the tuning method; valid options are `'fixed'`, `'adaptive'`, or `'cv'`. Defaults to `'adaptive'`.
`tuning_opts`	A list containing additional optional arguments controlling parameter tuning. The arguments used depend on the choice of tuning method.
	If `tuning='fixed'`, `multiness_fit` will use the following arguments:
	`lambda`	A positive scalar, the `\lambda` parameter in the nuclear norm penalty, see Details. Defaults to `2.309 * sqrt(n*m)`.
	`alpha`	A positive scalar or numeric vector of length `m`, the parameters `\alpha_k` in the nuclear norm penalty, see Details. If a scalar is provided, all `\alpha_k` parameters are set to that value. Defaults to `1/sqrt(m)`.
	If `tuning='adaptive'`, `multiness_fit` will use the following arguments:
	`layer_wise`	A Boolean; if `TRUE`, the entry-wise variance is estimated individually for each layer. Otherwise the estimates are pooled. Defaults to `TRUE`.
	`penalty_const`	A positive scalar `C` which scales the penalty parameters (see Details). Defaults to `2.309`.
	`penalty_const_lambda`	A positive scalar `c` which scales only the `\lambda` penalty parameter (see Details). Defaults to `1`.
	If `tuning='cv'`, `multiness_fit` will use the following arguments:
	`layer_wise`	A Boolean; if `TRUE`, the entry-wise variance is estimated individually for each layer. Otherwise the estimates are pooled. Defaults to `TRUE`.
	`N_cv`	A positive integer, the number of repetitions of edge cross-validation performed for each parameter setting. Defaults to `3`.
	`p_cv`	A positive scalar in the interval (0,1), the proportion of edge entries held out in edge cross-validation. Defaults to `0.1`.
	`penalty_const_lambda`	A positive scalar `c` which scales only the `\lambda` penalty parameter (see Details). Defaults to `1`.
	`penalty_const_vec`	A numeric vector with positive entries, the candidate values of the constant `C` used to scale the penalty parameters (see Details). An optimal constant is chosen by edge cross-validation. Defaults to `c(1,1.5,...,3.5,4)`.
	`refit_cv`	A Boolean; if `TRUE`, a refitting step is performed when fitting the model for edge cross-validation. Defaults to `TRUE`.
	`verbose_cv`	A Boolean; if `TRUE`, console output will provide updates on the progress of edge cross-validation. Defaults to `FALSE`.
`optim_opts`	A list containing additional optional arguments controlling the proximal gradient descent algorithm.
	`check_obj`	A Boolean; if `TRUE`, convergence is determined by checking the decrease in the objective. Otherwise it is determined by checking the average entry-wise difference in consecutive values of `F`. Defaults to `TRUE`.
	`eig_maxitr`	A positive integer, maximum iterations for internal eigenvalue solver. Defaults to `1000`.
	`eig_prec`	A positive scalar, estimated eigenvalues below this threshold are set to zero. Defaults to `1e-2`.
	`eps`	A positive scalar, convergence threshold for proximal gradient descent. Defaults to `1e-6`.
	`eta`	A positive scalar, step size for proximal gradient descent. Defaults to `1` for the Gaussian model, `5` for the logistic model.
	`init`	A string, initialization method. Valid options are `'fix'` (using initializers `optim_opts$V_init` and `optim_opts$U_init`), `'zero'` (initialize all parameters at zero), or `'svd'` (initialize with a truncated SVD with rank `optim_opts$init_rank`). Defaults to `'zero'`.
	`K_max`	A positive integer, maximum iterations for proximal gradient descent. Defaults to `100`.
	`max_rank`	A positive integer, maximum rank for internal eigenvalue solver. Defaults to `sqrt(n)`.
	`missing_pattern`	An `n \times n \times m` Boolean array with `TRUE` for each observed entry and `FALSE` for missing entries. If unspecified, it is set to `!is.na(A)`.
	`positive`	A Boolean; if `TRUE`, singular value thresholding only retains positive eigenvalues. Defaults to `FALSE`.
	`return_posns`	A Boolean; if `TRUE`, returns estimates of the latent positions based on ASE. Defaults to `FALSE`.
	`verbose`	A Boolean; if `TRUE`, console output will provide updates on the progress of proximal gradient descent. Defaults to `FALSE`.
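
As an illustration of the `optim_opts` list, the following sketch builds a `missing_pattern` array for a small toy network in base R. The toy data, the object names `obs` and `opts`, and the chosen option values are assumptions made for this example; the call to `multiness_fit` is left as a comment because its behavior on such a tiny network has not been checked here.

```r
# Toy 3-layer undirected network on 5 nodes (values are illustrative only).
set.seed(1)
n <- 5
m <- 3
A <- array(0, dim = c(n, n, m))
for (k in seq_len(m)) {
  E <- matrix(rnorm(n * n), n, n)
  A[, , k] <- (E + t(E)) / 2   # symmetrize each layer
}

# Mark one undirected edge in layer 2 as unobserved (both triangles).
A[1, 2, 2] <- NA
A[2, 1, 2] <- NA

# TRUE for observed entries, FALSE for missing ones. This mirrors the
# documented default, `!is.na(A)`.
obs <- !is.na(A)

# Collect non-default optimizer settings in a single list.
opts <- list(missing_pattern = obs, K_max = 200, eps = 1e-6, verbose = FALSE)

# The list would then be passed as the `optim_opts` argument, e.g.
# fit <- multiness_fit(A = A, model = "gaussian", optim_opts = opts)
```

Passing the mask explicitly is only needed when the observed pattern differs from `!is.na(A)`, for example when some recorded entries should be treated as held out.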

Details

A MultiNeSS model is fit to an `n \times n \times m` array `A` of symmetric adjacency matrices on a common set of nodes. Fitting proceeds by convex proximal gradient descent on the entries of `F = VV^{T}` and `G_k = U_kU_k^{T}`; see MacDonald et al. (2020), Section 3.2. Additional optional arguments for the gradient descent routine can be provided in `optim_opts`. The `refit` argument provides an option to perform an additional refitting step to debias the eigenvalues of the estimates; see MacDonald et al. (2020), Section 3.3.
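
The proximal step for a nuclear norm penalty on a symmetric matrix amounts to soft-thresholding its eigenvalues. The sketch below shows that operation in base R; the function name `soft_threshold_eig` and its arguments are hypothetical, and this is not the package's internal routine (which uses a truncated eigenvalue solver controlled by `max_rank`, `eig_maxitr` and `eig_prec`).

```r
# Soft-threshold the eigenvalues of a symmetric matrix M at level tau.
# With positive = TRUE only eigenvalues above tau survive, loosely mirroring
# the documented `positive` option (an assumption about its meaning).
soft_threshold_eig <- function(M, tau, positive = FALSE) {
  eig <- eigen((M + t(M)) / 2, symmetric = TRUE)
  lam <- eig$values
  shrunk <- if (positive) {
    pmax(lam - tau, 0)
  } else {
    sign(lam) * pmax(abs(lam) - tau, 0)
  }
  # Q %*% diag(shrunk) %*% t(Q), written without forming diag() explicitly.
  eig$vectors %*% (shrunk * t(eig$vectors))
}

M <- diag(c(5, 1, -3))
S_both <- soft_threshold_eig(M, tau = 2)                   # eigenvalues 3, 0, -1
S_pos  <- soft_threshold_eig(M, tau = 2, positive = TRUE)  # eigenvalues 3, 0, 0
```

Larger thresholds zero out more eigenvalues, which is how the penalty parameters control the estimated ranks of `F` and `G_k`.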

By default, `multiness_fit` will return estimates of the matrices `F` and `G_k`. `optim_opts$return_posns` provides an option to instead return estimates of latent positions `V` and `U_k` based on the adjacency spectral embedding (if such a factorization exists).
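
A minimal base R sketch of an adjacency spectral embedding is given here for orientation. The helper name `ase` and the tolerance `tol` are assumptions for this example, and the package's own embedding may differ in how it truncates or handles negative eigenvalues.

```r
# Embed a symmetric matrix F_hat as V_hat with V_hat %*% t(V_hat) ~ F_hat,
# keeping only eigenvalues above a small tolerance.
ase <- function(F_hat, tol = 1e-8) {
  eig <- eigen((F_hat + t(F_hat)) / 2, symmetric = TRUE)
  keep <- eig$values > tol
  Q <- eig$vectors[, keep, drop = FALSE]
  # Scale each retained eigenvector by the square root of its eigenvalue.
  t(t(Q) * sqrt(eig$values[keep]))
}

set.seed(42)
V_true <- matrix(rnorm(6 * 2), nrow = 6, ncol = 2)  # 6 nodes, 2 dimensions
F_true <- V_true %*% t(V_true)
V_hat  <- ase(F_true)
```

The recovered positions match `V_true` only up to an orthogonal rotation, so comparisons are best made through `V_hat %*% t(V_hat)`. The factorization exists only when the matrix has no negative eigenvalues, which is the caveat noted above.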

Tuning parameters `\lambda` and `\alpha_k` in the nuclear norm penalty

`\lambda ||F||_* + \sum_k \lambda \alpha_k ||G_k||_*`

are either set by the user (`tuning='fixed'`), selected adaptively using a robust estimator of the entry-wise variance (`tuning='adaptive'`), or selected using edge cross-validation (`tuning='cv'`). For more details see MacDonald et al. (2020), Section 3.4. Additional optional arguments for parameter tuning can be provided in `tuning_opts`.
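
To make the penalty concrete, the following base R sketch evaluates it for given matrices and computes the documented default values used under `tuning='fixed'`. The helper names `nuclear_norm` and `multiness_penalty` and the toy matrices are assumptions for this example and are not part of the package.

```r
# Nuclear norm: the sum of singular values.
nuclear_norm <- function(M) sum(svd(M, nu = 0, nv = 0)$d)

# lambda * ||F||_* + sum_k lambda * alpha_k * ||G_k||_*
multiness_penalty <- function(F_mat, G_list, lambda, alpha) {
  m <- length(G_list)
  if (length(alpha) == 1) alpha <- rep(alpha, m)  # scalar alpha is recycled
  stopifnot(length(alpha) == m)
  lambda * nuclear_norm(F_mat) +
    sum(lambda * alpha * vapply(G_list, nuclear_norm, numeric(1)))
}

# Documented defaults for tuning = 'fixed' with n = 100 nodes and m = 4 layers.
n <- 100
m <- 4
lambda_default <- 2.309 * sqrt(n * m)
alpha_default  <- 1 / sqrt(m)

# Toy evaluation: ||F||_* = 3, ||G_1||_* = 1, ||G_2||_* = 3.
pen <- multiness_penalty(F_mat = diag(c(2, 1, 0)),
                         G_list = list(diag(c(1, 0, 0)), diag(c(0, 3, 0))),
                         lambda = 2, alpha = 0.5)
```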

Value

A list is returned with the MultiNeSS model estimates, dimensions of the common and individual latent spaces, and some additional optimization output:

`F_hat`	An `n \times n` matrix estimating the common part of the expected adjacency matrix, `F = VV^{T}`. If `optim_opts$return_posns` is `TRUE`, this is not returned.
`G_hat`	A list of length `m`, the collection of `n \times n` matrices estimating the individual part of each adjacency matrix, `G_k = U_kU_k^{T}`. If `optim_opts$return_posns` is `TRUE`, this is not returned.
`V_hat`	A matrix estimating the common latent positions. Returned if `optim_opts$return_posns` is `TRUE`.
`U_hat`	A list of length `m`, the collection of matrices estimating the individual latent positions. Returned if `optim_opts$return_posns` is `TRUE`.
`d1`	A non-negative integer, the estimated common dimension of the latent space.
`d2`	An integer vector of length `m`, the estimated individual dimension of the latent space for each layer.
`K`	A positive integer, the number of iterations run in proximal gradient descent.
`convergence`	An integer convergence code, `0` if proximal gradient descent converged in fewer than `optim_opts$K_max` iterations, `1` otherwise.
`lambda`	A positive scalar, the tuned `\lambda` penalty parameter (see Details).
`alpha`	A numeric vector of length `m`, the tuned `\alpha` penalty parameters (see Details).
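
The returned list can be inspected with ordinary list accessors. The sketch below uses a hypothetical helper, `summarise_fit`, applied to a hand-built mock list with the documented element names; the numbers in `mock_fit` are made up for illustration and are not output from the package.

```r
# Tabulate the per-layer output of a fitted model and warn on non-convergence.
summarise_fit <- function(fit) {
  if (fit$convergence != 0) {
    warning("proximal gradient descent stopped at K_max without converging")
  }
  data.frame(
    layer      = seq_along(fit$d2),
    d1         = fit$d1,      # common dimension, repeated for each layer
    d2         = fit$d2,      # individual dimension of each layer
    alpha      = fit$alpha,   # per-layer penalty weight
    lambda     = fit$lambda,
    iterations = fit$K
  )
}

# Mock object with the documented element names (values are illustrative).
mock_fit <- list(d1 = 2L, d2 = c(2L, 2L, 1L), K = 37L, convergence = 0L,
                 lambda = 46.18, alpha = rep(0.5, 3))
fit_summary <- summarise_fit(mock_fit)
```

A `convergence` code of `1` suggests increasing `optim_opts$K_max` or adjusting the step size `optim_opts$eta` before interpreting the estimated dimensions.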

Examples

# gaussian model data
data1 <- multiness_sim(n=100,m=4,d1=2,d2=2,
                       model="gaussian")

# multiness_fit with fixed tuning
fit1 <- multiness_fit(A=data1$A,
                      model="gaussian",
                      self_loops=TRUE,
                      refit=FALSE,
                      tuning="fixed",
                      tuning_opts=list(lambda=40,alpha=1/2),
                      optim_opts=list(max_rank=20,verbose=TRUE))

# multiness_fit with adaptive tuning
fit2 <- multiness_fit(A=data1$A,
                      refit=TRUE,
                      tuning="adaptive",
                      tuning_opts=list(layer_wise=FALSE),
                      optim_opts=list(return_posns=TRUE))

# logistic model data
data2 <- multiness_sim(n=100,m=4,d1=2,d2=2,
                       model="logistic",
                       self_loops=FALSE)

# multiness_fit with cv tuning
fit3 <- multiness_fit(A=data2$A,
                      model="logistic",
                      self_loops=FALSE,
                      tuning="cv",
                      tuning_opts=list(N_cv=2,
                                       penalty_const_vec=c(1,2,2.309,3),
                                       verbose_cv=TRUE))

[Package multiness version 1.0.2 Index]