generate.data {PACLasso} R Documentation

Function to Randomly Generate Data (with Constraints)

Description

This function is primarily used for reproducibility. It generates a random data set of a given size, together with a given number of constraints, for use in testing function code.

Usage

generate.data(n = 1000, p = 10, m = 5, cov.mat = NULL, s = 5,
  sigma = 1, glasso = F, err = 0)

Arguments

n

number of rows in randomly-generated data set (default is 1000)

p

number of variables in randomly-generated data set (default is 10)

m

number of constraints in randomly-generated constraint matrix (default is 5)

cov.mat

a covariance matrix applied in the generation of data to impose a correlation structure. Default is NULL (no correlation)

s

number of true non-zero elements in coefficient vector beta1 (default is 5)

sigma

standard deviation of noise in response (default is 1, indicating standard normal)

glasso

whether the generalized Lasso (TRUE) or the standard Lasso (FALSE) should be used. Default is FALSE

err

error to be introduced in random generation of coefficient values. Default is no error (err = 0)
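
To make the roles of these arguments concrete, the following base-R sketch simulates data of the same general shape. It is an illustration under assumed modelling choices (Gaussian predictors, non-zero coefficients set to 1, a Gaussian constraint matrix, and b defined as C.full %*% beta), not the package's actual implementation. The helper name sketch_generate and all of its internals are hypothetical, and the glasso and err arguments are not modelled.

```r
# Hypothetical helper, NOT part of PACLasso: a rough stand-in for the kind of
# data that the arguments n, p, m, cov.mat, s, and sigma describe.
sketch_generate <- function(n = 1000, p = 10, m = 5, cov.mat = NULL,
                            s = 5, sigma = 1) {
  # Independent standard normal predictors
  x <- matrix(rnorm(n * p), nrow = n, ncol = p)
  if (!is.null(cov.mat)) {
    # Impose the correlation structure: rows of x %*% chol(cov.mat)
    # have covariance cov.mat (assumes cov.mat is positive definite)
    x <- x %*% chol(cov.mat)
  }
  # Sparse coefficient vector with s non-zero entries (set to 1 by assumption)
  beta <- numeric(p)
  beta[sample.int(p, s)] <- 1
  # Random constraint matrix, with b chosen so that C.full %*% beta = b holds
  C.full <- matrix(rnorm(m * p), nrow = m, ncol = p)
  b <- drop(C.full %*% beta)
  # Response: linear signal plus Gaussian noise with standard deviation sigma
  y <- drop(x %*% beta) + sigma * rnorm(n)
  list(x = x, y = y, C.full = C.full, b = b, beta = beta)
}

set.seed(1)
sim <- sketch_generate(n = 200, p = 8, m = 3, s = 4)
dim(sim$x)                                 # 200 rows, 8 columns
sum(sim$beta != 0)                         # 4 non-zero coefficients
max(abs(sim$C.full %*% sim$beta - sim$b))  # 0 by construction
```

Multiplying by the Cholesky factor keeps the sketch free of dependencies beyond base R; the package itself may generate correlated data differently.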

Value

x

generated x data

y

generated response y vector

C.full

generated full constraint matrix (with constraints of the form C.full*beta=b)

b

generated constraint vector b

b.run

if error was included, the error-adjusted value of b

beta

the complete beta vector, including generated beta1 and beta2
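
As a quick sanity check on these components, one can confirm that the returned pieces fit together as described. This sketch assumes PACLasso is installed and that, with the defaults err = 0 and glasso = FALSE, the constraint C.full*beta=b holds up to floating-point error. The variable name dat is arbitrary.

```r
library(PACLasso)

dat <- generate.data(n = 100, p = 10, m = 5)

# Names of the returned components
names(dat)

# Largest absolute violation of the constraint C.full %*% beta = b;
# expected to be near zero when no error is introduced (err = 0)
max(abs(dat$C.full %*% dat$beta - dat$b))

# Number of non-zero coefficients in the full beta vector
sum(dat$beta != 0)
```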

References

Gareth M. James, Courtney Paulson, and Paat Rusmevichientong (JASA, 2019) "Penalized and Constrained Optimization." (Full text available at http://www-bcf.usc.edu/~gareth/research/PAC.pdf)

Examples

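library(PACLasso)

# Generate 500 observations on 20 variables, with 10 constraints
random_data <- generate.data(n = 500, p = 20, m = 10)
dim(random_data$x)       # dimensions of the generated x data
head(random_data$y)      # first few values of the response
dim(random_data$C.full)  # dimensions of the full constraint matrix
random_data$beta         # complete coefficient vector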
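
Because the function is aimed at reproducibility, it may help to fix the random seed and to see how a correlation structure could be supplied. The sketch below assumes that generate.data draws from R's standard random number generator (so that set.seed controls it) and that cov.mat is expected to be a p x p matrix. The AR(1)-style matrix and the value rho = 0.5 are illustrative choices, not package defaults.

```r
library(PACLasso)

p <- 20
rho <- 0.5
# AR(1)-style covariance: entry (i, j) equals rho^|i - j|
ar1_cov <- rho^abs(outer(1:p, 1:p, "-"))

set.seed(2019)
run1 <- generate.data(n = 500, p = p, m = 10, cov.mat = ar1_cov)
set.seed(2019)
run2 <- generate.data(n = 500, p = p, m = 10, cov.mat = ar1_cov)

# TRUE if both calls produced the same data under the same seed
identical(run1$x, run2$x)

# Empirical correlation between the first two generated variables
cor(run1$x[, 1], run1$x[, 2])
```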

[Package PACLasso version 1.0.0 Index]