HGMND {HGMND}    R Documentation

Heterogeneous Graphical Model for Non-Negative Data

Description

HGMND is the main function of the package. It estimates the conditional dependence matrices of the variables in each of several datasets.

Usage

HGMND(x,
     setting,
     h,
     centered,
     mat.adj,
     lambda1,
     lambda2,
     gamma   = 1,
     maxit   = 200,
     tol     = 1e-5,
     silent  = TRUE)

Arguments

x

a list of data matrices sharing the same variables in their columns.

setting

a string indicating the data distribution; must be one of "gaussian", "gamma", "exp".

h

the function h(x) used in the h-generalized score matching loss. It returns a list containing hx = h(x) and its derivative hpx = hp(x), where x is the data matrix. See Details for more information.

centered

logical; if centered = TRUE, the data distribution is assumed to be centered, with \eta = 0.

mat.adj

the adjacency matrix of the network among the multiple datasets, containing only 0s and 1s. Only the upper triangle of mat.adj is used.

lambda1

the non-negative tuning parameter which controls the sparsity level of the estimation.

lambda2

the non-negative tuning parameter which controls the homogeneity level of the estimation.

gamma

the step size parameter in ADMM. Defaults to 1.

maxit

the maximum number of iterations. Defaults to 200.

tol

the tolerance in the convergence criterion. Defaults to 1e-5.

silent

logical; if silent = FALSE, the primal and dual feasibility and the time used in each ADMM iteration are printed to the console.

Details

h can be generated by the function get_h_hp in the package genscore. For more details, see Yu, S., Lin, L., & Gilks, W. (2020). genscore: Generalized Score Matching Estimators. R package version 1.0.2. https://CRAN.R-project.org/package=genscore and Yu, S., Drton, M., & Shojaie, A. (2019). Generalized Score Matching for Non-Negative Data. J. Mach. Learn. Res., 20(76), 1-71.
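As a rough illustration of the interface h is expected to follow, the base-R sketch below builds an h by hand instead of calling genscore::get_h_hp. The helper name make_h_trunc_pow and the choice h(x) = min(x, cap)^pow are illustrative assumptions and are not part of HGMND or genscore; the only property taken from this page is that h returns a list with elements hx and hpx.

```r
# Hypothetical helper (not part of HGMND or genscore): builds
# h(x) = min(x, cap)^pow together with its elementwise derivative.
make_h_trunc_pow <- function(pow = 1, cap = 3) {
  function(x) {
    below <- x < cap                      # TRUE where h is still increasing
    hx    <- pmin(x, cap)^pow             # h(x), same shape as x
    hpx   <- below * pow * x^(pow - 1)    # h'(x); zero once x >= cap
    list(hx = hx, hpx = hpx)
  }
}

h_demo <- make_h_trunc_pow(pow = 1, cap = 3)
x_demo <- matrix(c(0.5, 2, 4, 1), nrow = 2)   # small non-negative data matrix
out    <- h_demo(x_demo)
str(out)   # a list with two 2 x 2 matrices, hx and hpx
```

In practice, the functions generated by get_h_hp are the safer choice; a hand-written h like this one is mainly useful for seeing what shape of object is passed around.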

Suppose there are M datasets. The network among them is required to be connected and to have M - 1 edges, and is therefore acyclic (a tree). This condition is sufficient for computational feasibility, and it does not prevent the method from being applied to diverse network structures.
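The connectivity requirement can be checked before calling HGMND. The following is a minimal base-R sketch, assuming that the network is read from the strict upper triangle of mat.adj as described under Arguments; the function name is_tree_adj is hypothetical and is not exported by the package.

```r
# Hypothetical checker (not part of HGMND): TRUE if the upper triangle of
# mat.adj describes a connected network with exactly M - 1 edges.
is_tree_adj <- function(mat.adj) {
  M   <- nrow(mat.adj)
  upp <- mat.adj * upper.tri(mat.adj)   # keep the strict upper triangle only
  adj <- (upp + t(upp)) > 0             # symmetrize into an undirected graph
  if (sum(upp != 0) != M - 1) return(FALSE)

  # Breadth-first search from dataset 1 to test connectivity.
  visited    <- rep(FALSE, M)
  visited[1] <- TRUE
  queue      <- 1
  while (length(queue) > 0) {
    node  <- queue[1]
    queue <- queue[-1]
    nbrs  <- which(adj[node, ] & !visited)
    visited[nbrs] <- TRUE
    queue <- c(queue, nbrs)
  }
  all(visited)
}

# Chain network 1 - 2 - 3 - 4, built as in the Examples section.
M <- 4
mat.chain <- diag(M)
diag(mat.chain[-M, -1]) <- 1
is_tree_adj(mat.chain)   # TRUE

# Three edges among datasets 1-3 leave dataset 4 isolated.
mat.bad <- matrix(0, M, M)
mat.bad[1, 2] <- mat.bad[1, 3] <- mat.bad[2, 3] <- 1
is_tree_adj(mat.bad)     # FALSE
```

A connected graph with exactly M - 1 edges is necessarily a tree, so counting the edges and running one search is enough; no separate cycle check is needed.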

Value

HGMND returns a list containing the estimated conditional dependence matrix of each dataset, with the following components.

Theta

a 3-dimensional array containing the estimates of the multiple conditional dependence matrices. The 3rd dimension indexes the datasets.

M

an integer, the number of datasets.

P

an integer, the dimension of the random vector of interest.
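To show how the returned array might be inspected, the sketch below works on a small stand-in for Theta rather than on real HGMND output, so it runs without the package. The array values are made up, and treating exactly non-zero off-diagonal entries as edges is an assumption; a small threshold may be preferable if the returned entries are not exactly zero.

```r
# Stand-in for result[["Theta"]]: a P x P x M array with made-up values.
P <- 3
M <- 2
Theta <- array(0, dim = c(P, P, M))
Theta[, , 1] <- diag(P)
Theta[1, 2, 1] <- Theta[2, 1, 1] <- 0.4
Theta[, , 2] <- diag(P)
Theta[2, 3, 2] <- Theta[3, 2, 2] <- -0.3

# Estimated conditional dependence matrix of dataset 2.
Theta_2 <- Theta[, , 2]

# Non-zero off-diagonal entries (upper triangle) for each dataset.
edges <- lapply(seq_len(dim(Theta)[3]), function(m) {
  Th <- Theta[, , m]
  which(Th != 0 & upper.tri(Th), arr.ind = TRUE)
})
edges[[1]]   # one edge, between variables 1 and 2
edges[[2]]   # one edge, between variables 2 and 3
```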

References

Yu, S., Drton, M., & Shojaie, A. (2019). Generalized Score Matching for Non-Negative Data. J. Mach. Learn. Res., 20(76), 1-71.

Yu, S., Lin, L., & Gilks, W. (2020). genscore: Generalized Score Matching Estimators. R package version 1.0.2. https://CRAN.R-project.org/package=genscore.

Examples

# This is an example of HGMND with simulated data
data(HGMND_SimuData)
h              <- genscore::get_h_hp("mcp", 1, 5)
HGMND_SimuData <- lapply(HGMND_SimuData, function(x) scale(x, center = FALSE))
mat.chain      <- diag(length(HGMND_SimuData))
diag(mat.chain[-nrow(mat.chain), -1]) <- 1

result <- HGMND(x        = HGMND_SimuData,
                setting  = "gaussian",
                h        = h,
                centered = FALSE,
                mat.adj  = mat.chain,
                lambda1  = 0.086,
                lambda2  = 3.6,
                gamma    = 1,
                tol      = 1e-3,
                silent   = TRUE)
Theta       <- result[["Theta"]]

[Package HGMND version 0.1.0 Index]