plnmfa {lnmCluster}    R Documentation

Penalized Logistic Normal Multinomial factor analyzer algorithm

Description

Main function that fits penalized logistic normal multinomial (PLNM) factor analyzer models and selects the best model based on BIC, AIC, or ICL.

Usage

plnmfa(W_count, range_G, range_Q, model, criteria, range_tuning, iter, X)

Arguments

W_count

The microbiome count matrix.

range_G

A vector of candidate numbers of components.

range_Q

The number of latent dimensions, given as a single value.

model

The covariance structure to fit. Two models belong to this family: UUU and GUU. You can supply more than one covariance structure to perform model selection.

criteria

One of AIC, BIC, or ICL. The best model depends on the criterion you choose. The default is BIC.

range_tuning

A vector of candidate tuning parameter values, each between 0 and 1.

iter

Maximum number of iterations. The default is 150.

X

The regression covariate matrix, as generated by model.matrix.
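
As a minimal sketch of how such a matrix might be built, the snippet below calls base R's model.matrix on a small data frame. The data frame `meta` and its columns `age` and `group` are hypothetical and serve only as an illustration. Presumably X needs one row per sample in W_count, and whether plnmfa expects the intercept column to be kept is not stated here, so check both points against the package before use.

```r
# Hypothetical sample-level covariates (illustration only, not from the package)
meta <- data.frame(
  age   = c(34, 51, 27, 45),
  group = factor(c("control", "treated", "control", "treated"))
)

# model.matrix expands the formula into a numeric design matrix:
# an intercept column, the numeric covariate, and a dummy column for the factor
X <- model.matrix(~ age + group, data = meta)

dim(X)       # one row per sample, one column per model term
colnames(X)  # "(Intercept)" "age" "grouptreated"
```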

Value

z_ig Estimated latent variable z

cluster Component labels

mu_g Estimated component mean

pi_g Estimated component proportion

B_g Estimated bicluster membership

D_g Estimated error covariance

COV Estimated component covariance

beta_g Estimated covariate coefficients

overall_loglik Complete log likelihood value for each iteration

ICL ICL value

BIC BIC value

AIC AIC value

all_fitted_model A data.frame listing the names of all fitted models.

Examples

#generate toy data with n=100, K=5
#set up parameters
n<-100
p<-5
mu1<-c(-2.8,-1.3,-1.6,-3.9,-2.6)
B1<-matrix(c(1,0,1,0,1,0,0,1,0,1),nrow = p, byrow=TRUE)
T1<-diag(c(2.9,0.5))
D1<-diag(c(0.52, 1.53, 0.56, 0.19, 1.32))
cov1<-B1%*%T1%*%t(B1)+D1
mu2<-c(1.5,-2.7,-1.1,-0.4,-1.4)
B2<-matrix(c(1,0,1,0,0,1,0,1,0,1),nrow = p, byrow=TRUE)
T2<-diag(c(0.2,0.003))
D2<-diag(c(0.01, 0.62, 0.45, 0.01, 0.37))
cov2<-B2%*%T2%*%t(B2)+D2

#generate normal distribution
library(mvtnorm)
simp<-rmultinom(n,1,c(0.6,0.4))
lab<-as.factor(apply(t(simp),1,which.max))
df<-matrix(0,nrow=n,ncol=p)
for (i in 1:n) {
 if(lab[i]==1){df[i,]<-rmvnorm(1,mu1,sigma = cov1)}
 else if(lab[i]==2){df[i,]<-rmvnorm(1,mu2,sigma = cov2)}
}
#apply inverse of additive log ratio and transform normal to count data
f_df<-cbind(df,0)
z<-exp(f_df)/rowSums(exp(f_df))
W_count<-matrix(0,nrow=n,ncol=p+1)
for (i in 1:n) {
 W_count[i,]<-rmultinom(1,runif(1,10000,20000),z[i,])
}

#to run a single model, let range_G and range_tuning each be a single value
#you can always overspecify Q, so running models over a range of Q is not recommended.
res<-plnmfa(W_count,2,2,model="UUU",range_tuning=0.6)

#to run model selection, let any of the range_ parameters be a vector.
res<-plnmfa(W_count,c(2:3),3,range_tuning=seq(0.5,0.8,by=0.1))
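
The transformation step in the example above maps each latent vector to a composition through the inverse additive log-ratio: a zero is appended as the reference category and the exponentiated values are normalized to sum to one. The following sketch checks that mapping on a small hypothetical matrix `y` (not part of the package) and confirms that taking log-ratios against the last category recovers the input.

```r
# Hypothetical latent values: 3 samples, 2 log-ratio coordinates
y <- matrix(c(-1.0,  0.5,
               0.2, -0.3,
               1.5,  2.0), nrow = 3, byrow = TRUE)

# Inverse additive log-ratio: append a zero reference column, then normalize
y_ref <- cbind(y, 0)
comp  <- exp(y_ref) / rowSums(exp(y_ref))

# Each row is now a composition over 3 categories
rowSums(comp)

# Forward additive log-ratio: log of each part relative to the last one
k <- ncol(comp)
y_back <- log(comp[, -k, drop = FALSE] / comp[, k])

all.equal(y_back, y)  # the round trip recovers y up to floating-point error
```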







[Package lnmCluster version 0.3.1 Index]