convexLogisticPCA {logisticPCA}    R Documentation

Convex Logistic Principal Component Analysis

Description

Dimensionality reduction for binary data by extending Pearson's PCA formulation to minimize Binomial deviance. The constraint set of rank-k projection matrices is relaxed to its convex hull, the Fantope.
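
For orientation, the Fantope of order k can be described as the set of symmetric matrices whose eigenvalues lie in [0, 1] and sum to k; rank-k projection matrices are its extreme points. The base R sketch below does not call the package. The helper in_fantope and all variable names are illustrative assumptions, not part of logisticPCA.

```r
set.seed(1)
d <- 5
k <- 2

# Orthonormal basis taken from the QR decomposition of a random matrix
Q <- qr.Q(qr(matrix(rnorm(d * d), d, d)))

# A rank-k projection matrix: U U^T, where U has k orthonormal columns
U <- Q[, 1:k]
P <- U %*% t(U)

# A Fantope member that is not a projection:
# eigenvalues in [0, 1] that sum to k, but are not all 0 or 1
lambda <- c(1, 0.5, 0.5, 0, 0)
H <- Q %*% diag(lambda) %*% t(Q)

# Hypothetical helper: check the eigenvalue characterization of the Fantope
in_fantope <- function(A, k, tol = 1e-8) {
  ev <- eigen((A + t(A)) / 2, symmetric = TRUE)$values
  all(ev > -tol) && all(ev < 1 + tol) && abs(sum(ev) - k) < tol
}

in_fantope(P, k)
in_fantope(H, k)

# P is idempotent (a projection); H is not
max(abs(P %*% P - P))
max(abs(H %*% H - H))
```

Both matrices pass the membership check, but only P satisfies P %*% P = P. This is the sense in which the returned H may differ from a projection matrix, and why the Value section reports loss traces for both.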

Usage

convexLogisticPCA(x, k = 2, m = 4, quiet = TRUE, partial_decomp = FALSE,
  max_iters = 1000, conv_criteria = 1e-06, random_start = FALSE, start_H,
  mu, main_effects = TRUE, ss_factor = 4, weights, M)

Arguments

x

matrix with all binary entries

k

number of principal components to return

m

value to approximate the saturated model

quiet

logical; whether the calculation should give feedback

partial_decomp

logical; if TRUE, the function uses the rARPACK package to quickly initialize H when ncol(x) is large and k is small

max_iters

maximum number of iterations

conv_criteria

convergence criterion: the difference between the average deviance in successive iterations

random_start

logical; whether to randomly initialize the parameters. If FALSE, the function will use an eigen-decomposition as the starting value

start_H

starting value for the Fantope matrix

mu

main effects vector. Only used if main_effects = TRUE

main_effects

logical; whether to include main effects in the model

ss_factor

step size multiplier: the amount by which to multiply the step size. Quadratic convergence rate can be proven for ss_factor = 1, but I have found that higher values sometimes work better. The default is ss_factor = 4. If the algorithm is not converging, try ss_factor = 1.

weights

an optional matrix of the same size as x with non-negative weights

M

deprecated. Use m instead
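
To make the role of m more concrete: for binary data, the saturated model's natural parameters (logits) are infinite, plus infinity where x is 1 and minus infinity where x is 0. Following the approach in the cited reference, these can be approximated by the finite values m * (2 * x - 1). The base R sketch below illustrates that approximation only; it does not call the package, and the variable names q and theta_sat are illustrative assumptions.

```r
# A small binary matrix
x <- matrix(c(1, 0, 0, 1, 1, 0), nrow = 2)
m <- 4

# Map {0, 1} to {-1, +1}
q <- 2 * x - 1

# Finite stand-in for the saturated model's natural parameters
theta_sat <- m * q
theta_sat

# Implied probabilities: plogis(m) where x is 1, plogis(-m) where x is 0
plogis(theta_sat)

# With m = 4, the implied probability for an observed 1 is about 0.982
plogis(m)

# Larger m moves the approximation closer to the exact saturated model
sapply(c(2, 4, 8, 16), plogis)
```

Larger values of m fit the observed 0s and 1s more closely, while smaller values shrink the approximated natural parameters toward zero.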

Value

An S3 object of class clpca which is a list with the following components:

mu

the main effects

H

a rank k Fantope matrix

U

a ceiling(k)-dimensional orthonormal matrix with the loadings

PCs

the principal component scores

m

the value of m that was supplied

iters

number of iterations required for convergence

loss_trace

the trace of the average negative log likelihood using the Fantope matrix

proj_loss_trace

the trace of the average negative log likelihood using the projection matrix

prop_deviance_expl

the proportion of deviance explained by this model. If main_effects = TRUE, the null model is just the main effects, otherwise the null model estimates 0 for all natural parameters.
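
As a rough illustration of how a proportion of deviance explained relates to the two null models described above, the base R sketch below uses the usual definition, 1 - (model deviance) / (null deviance). It does not call the package and may differ in detail from the package's internal computation. The function bernoulli_deviance is a hypothetical helper, and p_fit is a stand-in for fitted probabilities (here, the probabilities that generated the data), not output from convexLogisticPCA.

```r
set.seed(1)
rows <- 100
cols <- 10

# Simulate binary data from a rank-one matrix on the logit scale
theta_true <- outer(rnorm(rows), rnorm(cols))
x <- (matrix(runif(rows * cols), rows, cols) <= plogis(theta_true)) * 1

# Hypothetical helper: Bernoulli deviance, -2 times the log likelihood
bernoulli_deviance <- function(x, p) {
  -2 * sum(log(ifelse(x == 1, p, 1 - p)))
}

# Null model with main effects only: one probability per column
p_null_main <- matrix(colMeans(x), rows, cols, byrow = TRUE)

# Null model without main effects: all natural parameters are 0, so p = 0.5
p_null_zero <- matrix(0.5, rows, cols)

# Stand-in for a fitted model's probabilities (an assumption for illustration)
p_fit <- plogis(theta_true)

dev_fit <- bernoulli_deviance(x, p_fit)
dev_null_main <- bernoulli_deviance(x, p_null_main)
dev_null_zero <- bernoulli_deviance(x, p_null_zero)

# Proportion of deviance explained relative to each null model
prop_main <- 1 - dev_fit / dev_null_main
prop_zero <- 1 - dev_fit / dev_null_zero
c(main_effects_null = prop_main, zero_null = prop_zero)
```

Because the main-effects null model always fits at least as well as the all-zero null model, the same fit explains a smaller share of deviance when measured against the main-effects null.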

References

Landgraf, A.J. & Lee, Y., 2015. Dimensionality reduction for binary data through the projection of natural parameters. arXiv preprint arXiv:1510.06112.

Examples

# construct a low rank matrix in the logit scale
rows = 100
cols = 10
set.seed(1)
mat_logit = outer(rnorm(rows), rnorm(cols))

# generate a binary matrix
mat = (matrix(runif(rows * cols), rows, cols) <= inv.logit.mat(mat_logit)) * 1.0

# run convex logistic PCA on it
clpca = convexLogisticPCA(mat, k = 1, m = 4)
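
The fitted object can then be inspected through the components listed in the Value section. The sketch below repeats the example so that it is self-contained and guards on the package being installed. The dimensions noted in the comments are what the Value section suggests and are assumptions rather than verified output.

```r
if (requireNamespace("logisticPCA", quietly = TRUE)) {
  library(logisticPCA)

  # Same simulated data as in the example above
  set.seed(1)
  rows <- 100
  cols <- 10
  mat_logit <- outer(rnorm(rows), rnorm(cols))
  mat <- (matrix(runif(rows * cols), rows, cols) <= inv.logit.mat(mat_logit)) * 1.0

  clpca <- convexLogisticPCA(mat, k = 1, m = 4)

  # Loadings: expected to have one row per column of mat
  dim(clpca$U)

  # Principal component scores: expected to have one row per row of mat
  dim(clpca$PCs)

  # The trace of the Fantope matrix should be close to k
  sum(diag(clpca$H))

  # Convergence diagnostics
  clpca$iters
  tail(clpca$loss_trace, 1)
  tail(clpca$proj_loss_trace, 1)

  # Proportion of deviance explained relative to the main-effects null model
  clpca$prop_deviance_expl
}
```

If the iteration count equals max_iters, the Arguments section suggests trying ss_factor = 1.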

[Package logisticPCA version 0.2 Index]