mlsei {CMLS} | R Documentation
Multivariate Least Squares with Equality/Inequality Constraints
Description
Finds the q x p matrix B that minimizes the multivariate least squares problem

sum(( Y - X %*% t(Z %*% B) )^2)

subject to t(A) %*% B[,j] >= b for all j = 1:p. Unique basis functions and constraints are allowed for each column of B.
Usage
mlsei(X, Y, Z, A, b, meq,
backfit = FALSE, maxit = 1000,
eps = 1e-10, del = 1e-6,
XtX = NULL, ZtZ = NULL,
simplify = TRUE, catchError = FALSE)
Arguments
X
Matrix of dimension n x p.

Y
Matrix of dimension n x m.

Z
Matrix of dimension m x q. Can also be a list of length p (see Note).

A
Constraint matrix of dimension q x r. Can also be a list of length p (see Note).

b
Constraint vector of dimension r x 1. Can also be a list of length p (see Note).
meq
The first meq constraints (columns of A) are treated as equality constraints; the remaining r - meq are inequality constraints.

backfit
Estimate B via the back-fitting algorithm (TRUE) or a closed-form solution (FALSE). See Details.

maxit
Maximum number of iterations for the back-fitting algorithm. Ignored if backfit = FALSE.

eps
Convergence tolerance for the back-fitting algorithm. Ignored if backfit = FALSE.

del
Stability tolerance for the back-fitting algorithm. Ignored if backfit = FALSE.

XtX
Crossproduct matrix: XtX = crossprod(X).

ZtZ
Crossproduct matrix: ZtZ = crossprod(Z).

simplify
If TRUE, the output B is simplified to a matrix when possible. See Note.
catchError
If FALSE, an error encountered during estimation (e.g., from the quadratic programming solver) stops the function; if TRUE, such errors are caught via tryCatch.
Details
If backfit = FALSE (default), a closed-form solution is used to estimate B whenever possible. Otherwise a back-fitting algorithm is used, where the columns of B are updated sequentially until convergence. The back-fitting algorithm is determined to have converged when

mean((B.new - B.old)^2) < eps * (mean(B.old^2) + del),

where B.old and B.new denote the parameter estimates at iterations t and t+1 of the back-fitting algorithm.
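The stopping rule above can be sketched as a small base-R helper (the function name converged is illustrative and not part of the CMLS package; eps and del are as in the Arguments section):

```r
# Illustrative helper: the back-fitting loop stops once the mean squared
# change in B is small relative to the (stabilized) size of the old estimate.
converged <- function(B.new, B.old, eps = 1e-10, del = 1e-6) {
  mean((B.new - B.old)^2) < eps * (mean(B.old^2) + del)
}

converged(matrix(1, 2, 2), matrix(1, 2, 2))   # no change: TRUE
converged(matrix(2, 2, 2), matrix(1, 2, 2))   # large change: FALSE
```

The del term keeps the threshold positive when B.old is (near) zero, so the criterion remains usable from a zero start.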
Value
If Z is a list with q_j = q for all j = 1,...,p, then:

B
is returned as a q x p matrix if simplify = TRUE.

B
is returned as a list of length p if simplify = FALSE.

If Z is a list with q_j != q for some j, then B is returned as a list of length p.

Otherwise B is returned as a q x p matrix.
Note
The Z
input can also be a list of length p
where Z[[j]]
contains an m
x q_j
matrix. If q_j = q
for all j = 1,\ldots,p
and simplify = TRUE
, the output B
will be a matrix. Otherwise B
will be a list of length p
where B[[j]]
contains a q_j
x 1 vector.
The A
and b
inputs can also be lists of length p
where t(A[[j]]) %*% B[,j] >= b[[j]]
for all j = 1,\ldots,p
. If A
and b
are lists of length p
, the meq
input should be a vector of length p
indicating the number of equality constraints for each element of A
.
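As a sketch of the list-valued Z input described above (assuming the CMLS package is installed; the B-spline bases from the base splines package and the df values are arbitrary illustrations, not recommendations):

```r
library(CMLS)
library(splines)
set.seed(3)
n <- 50; m <- 20; p <- 2
x <- seq(0, 1, length.out = m)
Xmat <- matrix(rnorm(n * p), nrow = n, ncol = p)
Ymat <- Xmat %*% rbind(sin(2*pi*x), cos(2*pi*x)) + rnorm(n*m, sd = 0.25)
# column-specific bases: q_1 = 5 and q_2 = 7 basis functions
Zlist <- list(bs(x, df = 5), bs(x, df = 7))
# q_j differs across j, so B should be returned as a list of length p
Bhat <- mlsei(X = Xmat, Y = Ymat, Z = Zlist)
```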
Author(s)
Nathaniel E. Helwig <helwig@umn.edu>
References
Goldfarb, D., & Idnani, A. (1983). A numerically stable dual method for solving strictly convex quadratic programs. Mathematical Programming, 27, 1-33. doi:10.1007/BF02591962
Helwig, N. E. (in prep). Constrained multivariate least squares in R.
Ten Berge, J. M. F. (1993). Least Squares Optimization in Multivariate Analysis. Volume 25 of M & T Series. DSWO Press, Leiden University. ISBN: 9789066950832
Turlach, B. A., & Weingessel, A. (2019). quadprog: Functions to solve Quadratic Programming Problems. R package version 1.5-8. https://CRAN.R-project.org/package=quadprog
See Also
cmls
calls this function for several of the constraints.
Examples
######***###### GENERATE DATA ######***######
# make X
set.seed(2)
n <- 50
m <- 20
p <- 2
Xmat <- matrix(rnorm(n*p), nrow = n, ncol = p)
# make B (which satisfies all constraints except monotonicity)
x <- seq(0, 1, length.out = m)
Bmat <- rbind(sin(2*pi*x), sin(2*pi*x+pi)) / sqrt(4.75)
struc <- rbind(rep(c(TRUE, FALSE), each = m / 2),
               rep(c(FALSE, TRUE), each = m / 2))
Bmat <- Bmat * struc
# make noisy data
set.seed(1)
Ymat <- Xmat %*% Bmat + rnorm(n*m, sd = 0.25)
######***###### UNCONSTRAINED ######***######
# unconstrained
Bhat.cmls <- cmls(X = Xmat, Y = Ymat, const = "uncons")
Bhat.mlsei <- t(mlsei(X = Xmat, Y = Ymat))
mean((Bhat.cmls - Bhat.mlsei)^2)
# unconstrained and structured (note: cmls is more efficient)
Bhat.cmls <- cmls(X = Xmat, Y = Ymat, const = "uncons", struc = struc)
Amat <- vector("list", p)
meq <- rep(0, p)
for(j in 1:p){
  meq[j] <- sum(!struc[j,])
  if(meq[j] > 0){
    A <- matrix(0, nrow = m, ncol = meq[j])
    A[!struc[j,],] <- diag(meq[j])
    Amat[[j]] <- A
  } else {
    Amat[[j]] <- matrix(0, nrow = m, ncol = 1)
  }
}
Bhat.mlsei <- t(mlsei(X = Xmat, Y = Ymat, A = Amat, meq = meq))
mean((Bhat.cmls - Bhat.mlsei)^2)
######***###### NON-NEGATIVITY ######***######
# non-negative
Bhat.cmls <- cmls(X = Xmat, Y = Ymat, const = "nonneg")
Bhat.mlsei <- t(mlsei(X = Xmat, Y = Ymat, A = diag(m)))
mean((Bhat.cmls - Bhat.mlsei)^2)
# non-negative and structured (note: cmls is more efficient)
Bhat.cmls <- cmls(X = Xmat, Y = Ymat, const = "nonneg", struc = struc)
eye <- diag(m)
meq <- rep(0, p)
for(j in 1:p){
  meq[j] <- sum(!struc[j,])
  Amat[[j]] <- eye[,sort(struc[j,], index.return = TRUE)$ix]
}
Bhat.mlsei <- t(mlsei(X = Xmat, Y = Ymat, A = Amat, meq = meq))
mean((Bhat.cmls - Bhat.mlsei)^2)
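The same A interface can also impose monotonicity. The following is a hedged sketch, not taken from the package documentation: it continues with Xmat and Ymat from above and assumes b defaults to a zero vector (as the non-negativity example suggests).

```r
######***###### MONOTONICITY (illustrative sketch) ######***######
# first-difference matrix D: requiring D %*% B[,j] >= 0 forces each
# column of B to be non-decreasing; mlsei expects A with t(A) = D
D <- diff(diag(m))                 # (m-1) x m first-difference matrix
Bhat.mono <- t(mlsei(X = Xmat, Y = Ymat, A = t(D)))
# each row of Bhat.mono (= each column of B) should be non-decreasing
all(apply(Bhat.mono, 1, function(bj) all(diff(bj) >= -1e-8)))
```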
# see internals of cmls.R for further examples