error_loop {SimMultiCorrData} | R Documentation |
Error Loop to Correct Final Correlation of Simulated Variables
Description
This function corrects the final correlation of simulated variables to be within a precision value (epsilon) of the
target correlation. It updates the pairwise intermediate MVN correlation iteratively in a loop until either the maximum error
is less than epsilon or the number of iterations exceeds the maximum number set by the user (maxit). It uses error_vars
to simulate all variables and calculate the correlation of all variables in each
iteration. This function would not ordinarily be called directly by the user. The function is a
modification of Barbiero & Ferrari's ordcont function in GenOrd-package.
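The loop described above can be sketched for a single pair of variables in base R. This is only an illustration under stated assumptions: the helper simulate_cor is hypothetical, the lognormal and Poisson marginals are arbitrary choices, and the multiplicative update is a simple stand-in rather than the update rule this function actually uses.

```r
# Toy correction loop for ONE pair of variables (illustrative, not the package code).
n <- 10000
rho_target <- 0.5      # target correlation (one off-diagonal entry of rho0)
sigma <- rho_target    # starting intermediate (normal) correlation
epsilon <- 0.001       # acceptable error
maxit <- 50            # iteration cap

# Hypothetical helper: draw a bivariate normal pair with correlation s,
# transform to non-normal marginals, and return the achieved correlation.
simulate_cor <- function(s, n, seed) {
  set.seed(seed)                           # same seed each pass, for reproducibility
  z1 <- rnorm(n)
  z2 <- s * z1 + sqrt(1 - s^2) * rnorm(n)
  y1 <- exp(z1)                            # assumed lognormal marginal
  y2 <- qpois(pnorm(z2), lambda = 2)       # assumed Poisson marginal
  cor(y1, y2)
}

rho_calc <- simulate_cor(sigma, n, seed = 1234)
niter <- 0
while (abs(rho_calc - rho_target) > epsilon && niter < maxit) {
  # Assumed update: rescale the intermediate correlation by target / achieved
  sigma <- sigma * rho_target / rho_calc
  sigma <- max(min(sigma, 1), -1)          # keep it a valid correlation
  rho_calc <- simulate_cor(sigma, n, seed = 1234)
  niter <- niter + 1
}
```

The loop ends either because the error has dropped to epsilon or below, or because niter has reached maxit, which mirrors the two stopping conditions named in the description.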
The ordcont function has been modified in the following ways:
1) It works for continuous, ordinal (r >= 2 categories), and count variables.
2) The initial correlation check has been removed because the intermediate correlation
Sigma from rcorrvar or rcorrvar2 has already been
checked for positive-definiteness and used to generate variables.
3) Eigenvalue decomposition is done on Sigma to impose the correct intermediate correlations on the normal variables.
If Sigma is not positive-definite, the negative eigenvalues are replaced with 0.
4) The final positive-definite check has been removed.
5) The intermediate correlation update function was changed to accommodate more situations.
6) A final "fail-safe" check was added at the end of the iteration loop: if the absolute
error between the final and target pairwise correlation is still > 0.1, the intermediate correlation is set
equal to the target correlation (if extra_correct = "TRUE").
7) It allows the sample size and the seed to be specified for reproducibility.
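Item 3 can be illustrated with a short base R sketch. It assumes a small, deliberately non-positive-definite matrix and uses only eigen and matrix products; the object names (F_mat, Sigma_psd) are made up for the example, and the package's own implementation may include further steps that are not shown here.

```r
# A 3 x 3 "correlation" matrix that is NOT positive-definite (invented values)
Sigma <- matrix(c( 1.0, 0.9, -0.9,
                   0.9, 1.0,  0.9,
                  -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)

eig <- eigen(Sigma, symmetric = TRUE)
vals <- pmax(eig$values, 0)                 # replace negative eigenvalues with 0

# Factor such that F_mat %*% t(F_mat) is positive semi-definite
F_mat <- eig$vectors %*% diag(sqrt(vals))
Sigma_psd <- F_mat %*% t(F_mat)

# Impose the (repaired) dependence structure on independent standard normals
set.seed(1)
n <- 5000
Z <- matrix(rnorm(n * 3), nrow = n)
X <- Z %*% t(F_mat)

# Note: zeroing eigenvalues changes the diagonal, so Sigma_psd is a covariance
# matrix rather than a correlation matrix; cov2cor(Sigma_psd) would rescale it.
```

The sample correlation of the columns of X approximates cov2cor(Sigma_psd), not the original Sigma, since the original matrix could not be reproduced by any set of real-valued variables.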
Usage
error_loop(k_cat, k_cont, k_pois, k_nb, Y_cat, Y, Yb, Y_pois, Y_nb, marginal,
support, method, means, vars, constants, lam, size, prob, mu, n, seed,
epsilon, maxit, rho0, Sigma, rho_calc, extra_correct)
Arguments
k_cat |
the number of ordinal (r >= 2 categories) variables |
k_cont |
the number of continuous variables |
k_pois |
the number of Poisson variables |
k_nb |
the number of Negative Binomial variables |
Y_cat |
the ordinal variables |
Y |
the continuous (mean 0, variance 1) variables |
Yb |
the continuous variables with desired mean and variance |
Y_pois |
the Poisson variables |
Y_nb |
the Negative Binomial variables |
marginal |
a list of length equal |
support |
a list of length equal |
method |
the method used to generate the continuous variables. "Fleishman" uses a third-order polynomial transformation and "Polynomial" uses Headrick's fifth-order transformation. |
means |
a vector of means for the continuous variables |
vars |
a vector of variances |
constants |
a matrix with |
lam |
a vector of lambda (> 0) constants for the Poisson variables (see |
size |
a vector of size parameters for the Negative Binomial variables (see |
prob |
a vector of success probability parameters |
mu |
a vector of mean parameters (*Note: either |
n |
the sample size |
seed |
the seed value for random number generation |
epsilon |
the maximum acceptable error between the final and target correlation matrices; smaller epsilons take more time |
maxit |
the maximum number of iterations to use to find the intermediate correlation; the correction loop stops when either the iteration number passes |
rho0 |
the target correlation matrix |
Sigma |
the intermediate correlation matrix previously used in |
rho_calc |
the final correlation matrix calculated in |
extra_correct |
if "TRUE", a final "fail-safe" check is used at the end of the iteration loop: if the absolute error between the final and target pairwise correlation is still > 0.1, the intermediate correlation is set equal to the target correlation |
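The extra_correct rule can be sketched on small matrices as follows. All numeric values are invented for illustration, and the vectorised assignment is just one way to express the rule; it is not taken from the package source.

```r
# Invented target, achieved, and intermediate correlation matrices
rho0 <- matrix(c(1.0, 0.6, 0.3,
                 0.6, 1.0, 0.2,
                 0.3, 0.2, 1.0), nrow = 3, byrow = TRUE)
rho_calc <- matrix(c(1.00, 0.45, 0.29,
                     0.45, 1.00, 0.21,
                     0.29, 0.21, 1.00), nrow = 3, byrow = TRUE)
Sigma <- matrix(c(1.00, 0.70, 0.35,
                  0.70, 1.00, 0.25,
                  0.35, 0.25, 1.00), nrow = 3, byrow = TRUE)
extra_correct <- "TRUE"

if (extra_correct == "TRUE") {
  # Pairs whose final correlation is still more than 0.1 away from the target
  bad <- abs(rho_calc - rho0) > 0.1
  # Fail-safe: fall back to the target correlation for those pairs only
  Sigma[bad] <- rho0[bad]
}
```

Here only the (1, 2) pair misses its target by more than 0.1, so only that entry of Sigma is reset to 0.6; the other intermediate correlations are left as they were.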
Value
A list with the following components:
Sigma
the intermediate MVN correlation matrix resulting from the error loop
rho_calc
the calculated final correlation matrix generated from Sigma
Y_cat
the ordinal variables
Y
the continuous (mean 0, variance 1) variables
Yb
the continuous variables with desired mean and variance
Y_pois
the Poisson variables
Y_nb
the Negative Binomial variables
niter
a matrix containing the number of iterations required for each variable pair
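As a rough sketch of how the returned components might be inspected, the snippet below builds a hypothetical stand-in list named res with the same component names (Sigma, rho_calc, niter) and invented values, then compares it against an assumed target matrix rho0 and tolerance epsilon. It does not call the function itself.

```r
# Assumed target matrix and tolerance
rho0 <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)
epsilon <- 0.001

# Hypothetical stand-in for the returned list (values invented for illustration)
res <- list(
  Sigma    = matrix(c(1, 0.62, 0.62, 1), nrow = 2),
  rho_calc = matrix(c(1, 0.5004, 0.5004, 1), nrow = 2),
  niter    = matrix(c(0, 3, 3, 0), nrow = 2)
)

max_err <- max(abs(res$rho_calc - rho0))   # largest pairwise error
converged <- max_err < epsilon             # did the loop reach the tolerance?
max_iter_used <- max(res$niter)            # most iterations needed by any pair
```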
References
Barbiero A, Ferrari PA (2015). GenOrd: Simulation of Discrete Random Variables with Given Correlation Matrix and Marginal Distributions. R package version 1.4.0. https://CRAN.R-project.org/package=GenOrd
Ferrari PA, Barbiero A (2012). Simulating ordinal data. Multivariate Behavioral Research, 47(4): 566-589. doi: 10.1080/00273171.2012.692630.
Fleishman AI (1978). A Method for Simulating Non-normal Distributions. Psychometrika, 43, 521-532. doi: 10.1007/BF02293811.
Headrick TC (2002). Fast Fifth-order Polynomial Transforms for Generating Univariate and Multivariate Non-normal Distributions. Computational Statistics & Data Analysis, 40(4):685-711. doi: 10.1016/S0167-9473(02)00072-5.
Headrick TC (2004). On Polynomial Transformations for Simulating Multivariate Nonnormal Distributions. Journal of Modern Applied Statistical Methods, 3(1), 65-71. doi: 10.22237/jmasm/1083370080.
Headrick TC, Kowalchuk RK (2007). The Power Method Transformation: Its Probability Density Function, Distribution Function, and Its Further Use for Fitting Data. Journal of Statistical Computation and Simulation, 77, 229-249. doi: 10.1080/10629360600605065.
Headrick TC, Sawilowsky SS (1999). Simulating Correlated Non-normal Distributions: Extending the Fleishman Power Method. Psychometrika, 64, 25-35. doi: 10.1007/BF02294317.
Headrick TC, Sheng Y, Hodis FA (2007). Numerical Computing and Graphics for the Power Method Transformation Using Mathematica. Journal of Statistical Software, 19(3), 1-17. doi: 10.18637/jss.v019.i03.
Higham N (2002). Computing the nearest correlation matrix - a problem from finance. IMA Journal of Numerical Analysis, 22, 329-343.
See Also
ordcont, rcorrvar, rcorrvar2, findintercorr, findintercorr2