ModelSelection.Phase {HCmodelSets} | R Documentation |
Construct sets of well-fitting models as proposed by Cox, D. R. & Battey, H. S. (2017)
Description
This function tests low-dimensional subsets of the set of variables retained from the reduction phase, together with any squared or interaction terms suggested at the exploratory phase. Lists of well-fitting models of each dimension are returned.
Usage
ModelSelection.Phase(X,Y, list.reduction, family=gaussian,
signif=0.01, sq.terms=NULL, in.terms=NULL,
modelSize=NULL, Cox.Hazard = FALSE)
Arguments
X |
Design matrix. |
Y |
Response vector. |
list.reduction |
Indices of variables that were chosen at the reduction phase. |
family |
A description of the error distribution and link function to be used in the model. For glm this can be a character string naming a family function, a family function or the result of a call to a family function. See |
signif |
Significance level of the likelihood ratio test against the comprehensive model. The default is 0.01. |
sq.terms |
Indices of squared terms suggested at the exploratory phase (See |
in.terms |
Indices of pairs of variables suggested at the exploratory phase (See |
modelSize |
Maximum size of the models to be tested. Currently the maximum is 7. If not provided, a default is used. |
Cox.Hazard |
If TRUE, fits a proportional hazards regression model. The family argument is ignored if Cox.Hazard=TRUE. |
Value
goodModels |
List of models that are in the confidence set of size 1 to modelSize. An interaction term between, say, variables x_1 and x_2 is displayed as “x_1 * x_2”; a squared term in, say, variable x_1 is displayed as “x_1 ^2”. If an interaction term is present without the corresponding main effects, the main effects should be added. |
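The note on main effects can be illustrated with a minimal base-R sketch. It uses only standard formula syntax and does not call HCmodelSets; the variable names x1, x2 and y are placeholders chosen for illustration. Note that the label “x_1 * x_2” in goodModels denotes the interaction term alone, whereas in an R formula the `*` operator expands to the main effects plus the interaction.

```r
# Illustrative only: standard R formula syntax, placeholder variable names.

# Interaction term alone (no main effects), written with ":" in a formula.
f_inter_only <- y ~ x1:x2

# Main effects added alongside the interaction; "*" expands to x1 + x2 + x1:x2.
f_with_main <- y ~ x1 * x2

# A squared term must be wrapped in I() so that "^" is treated arithmetically.
f_squared <- y ~ x1 + I(x1^2)

# Inspect which terms each formula actually contains.
labels_inter_only <- attr(terms(f_inter_only), "term.labels")
labels_with_main  <- attr(terms(f_with_main), "term.labels")
labels_squared    <- attr(terms(f_squared), "term.labels")

print(labels_inter_only)
print(labels_with_main)
print(labels_squared)
```

When refitting a model from goodModels by hand, a formula of the second form is one way to follow the recommendation that main effects accompany an interaction.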
Acknowledgement
The work was supported by the UK Engineering and Physical Sciences Research Council under grant number EP/P002757/1.
Author(s)
Hoeltgebaum, H. H.
References
Cox, D. R. and Battey, H. S. (2017). Large numbers of explanatory variables, a semi-descriptive analysis. Proceedings of the National Academy of Sciences, 114(32), 8592-8595.
Battey, H. S. and Cox, D. R. (2018). Large numbers of explanatory variables: a probabilistic assessment. Proceedings of the Royal Society of London, A., 474(2215), 20170631.
Hoeltgebaum, H. and Battey, H. S. (2019). HCmodelSets: An R package for specifying sets of well-fitting models in high dimensions. The R Journal, 11(2), 370-379.
See Also
Reduction.Phase, Exploratory.Phase
Examples
## Generates a random DGP
dgp = DGP(s=5, a=3, sigStrength=1, rho=0.9, n=100, intercept=5, noise=1,
var=1, d=1000, DGP.seed = 2018)
# Reduction Phase using only the first 70 observations
outcome.Reduction.Phase = Reduction.Phase(X=dgp$X[1:70,],Y=dgp$Y[1:70],
family=gaussian, seed.HC = 1012)
# Exploratory Phase using only the first 70 observations, choosing the variables which
# were selected at least two times in the third dimension reduction
idxs = outcome.Reduction.Phase$List.Selection$`Hypercube with dim 2`$numSelected1
outcome.Exploratory.Phase = Exploratory.Phase(X=dgp$X[1:70,],Y=dgp$Y[1:70],
list.reduction = idxs,
family=gaussian, signif=0.01)
# Model Selection Phase using only the remaining observations
sq.terms = outcome.Exploratory.Phase$mat.select.SQ
in.terms = outcome.Exploratory.Phase$mat.select.INTER
MS = ModelSelection.Phase(X=dgp$X[71:100,],Y=dgp$Y[71:100], list.reduction = idxs,
sq.terms = sq.terms,in.terms = in.terms, signif=0.01)
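To make the role of signif concrete, the following is a minimal base-R sketch of a likelihood ratio test of a submodel against a comprehensive model. It is an illustration under stated assumptions, not the package's internal code: the simulated data, the helper lr_pvalue and the use of a chi-squared reference distribution are all choices made here for the example.

```r
# Illustrative sketch only: base R, not the HCmodelSets implementation.
set.seed(1)
n <- 200
dat <- data.frame(matrix(rnorm(n * 4), n, 4,
                         dimnames = list(NULL, paste0("x", 1:4))))
# Only x1 and x2 carry signal in this simulated example.
dat$y <- 1 + 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)

signif <- 0.01

# Comprehensive model: every retained variable is included.
full <- glm(y ~ x1 + x2 + x3 + x4, family = gaussian, data = dat)

# Hypothetical helper: p-value of the likelihood ratio test of `sub`
# against `full`, using the asymptotic chi-squared approximation.
lr_pvalue <- function(sub, full) {
  stat <- as.numeric(2 * (logLik(full) - logLik(sub)))
  df <- attr(logLik(full), "df") - attr(logLik(sub), "df")
  pchisq(stat, df = df, lower.tail = FALSE)
}

sub_good <- glm(y ~ x1 + x2, family = gaussian, data = dat)  # keeps the signal
sub_bad  <- glm(y ~ x3 + x4, family = gaussian, data = dat)  # drops the signal

p_good <- lr_pvalue(sub_good, full)
p_bad  <- lr_pvalue(sub_bad, full)

# A submodel stays in the set when the test does not reject it at level signif.
keep_good <- p_good > signif
keep_bad  <- p_bad > signif

print(c(p_good = p_good, p_bad = p_bad))
```

Here sub_bad omits both signal variables, so it is rejected at the 0.01 level. A smaller signif rejects fewer submodels and therefore yields a larger set of well-fitting models.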