bestModel {cNORM}

R Documentation

Best-fitting Regression Model Based on Powers and Interactions

Description

Computes and selects the best-fitting regression model by evaluating a series of models with an increasing number of predictors. The aim is a parsimonious model that still captures the variance in the data. In psychometric test construction, this helps to smooth the data and reduce noise while retaining key diagnostic information. Model selection can be based on the number of terms or on the explained variance (R^2). High values for the number of terms, the R^2 cutoff, or 'k' may lead to overfitting. Recommended starting points are 'terms = 5', 'R2 = .99', and 'k = 4'.

Usage

bestModel(
  data,
  raw = NULL,
  R2 = NULL,
  k = NULL,
  t = NULL,
  predictors = NULL,
  terms = 0,
  weights = NULL,
  force.in = NULL,
  plot = TRUE
)

Arguments

`data`	Preprocessed dataset with 'raw' scores, powers, interactions, and usually an explanatory variable (like age).
`raw`	Name of the raw score variable (default: 'raw').
`R2`	Adjusted R^2 stopping criterion for model building (default: 0.99).
`k`	Power constant influencing model complexity (default: 4, max: 6).
`t`	Age power parameter. If unset, defaults to 'k'.
`predictors`	List of predictors or regression formula for model selection. Overrides 'k' and can include additional variables.
`terms`	Desired number of terms in the model.
`weights`	Optional case weights. If set to FALSE, default weights (if any) are ignored.
`force.in`	Variables forcibly included in the regression.
`plot`	If TRUE (default), displays a percentile plot of the model.

Details

Additional functions like plotSubset(model) and cnorm.cv can aid in model evaluation.

Value

The model meeting the R^2 criterion. Further exploration can be done using plotSubset(model) and plotPercentiles(data, model).

Examples

## Not run: 
# Example with sample data
normData <- prepareData(elfe)
model <- bestModel(normData)
plotSubset(model)
plotPercentiles(normData, model)
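
# Sketch (not part of the original examples): choosing the selection
# criterion explicitly, using the starting points suggested in the
# Description. The object names model.terms and model.r2 are illustrative.
# Select by a fixed number of terms, suppressing the percentile plot:
model.terms <- bestModel(normData, terms = 5, plot = FALSE)
# Select by an adjusted R^2 cutoff with power constant k = 4:
model.r2 <- bestModel(normData, R2 = .99, k = 4, plot = FALSE)
# Compare the two resulting regression functions:
print(regressionFunction(model.terms))
print(regressionFunction(model.r2))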

# Specifying variables explicitly
preselectedModel <- bestModel(normData, predictors = c("L1", "L3", "L1A3", "A2", "A3"))
print(regressionFunction(preselectedModel))
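
# Sketch (not part of the original examples): checking for overfitting
# with cnorm.cv, as mentioned in the Details section. The argument names
# 'max' and 'repetitions' are assumed from the cnorm.cv help page; verify
# them with ?cnorm.cv before relying on this call.
cnorm.cv(normData, max = 10, repetitions = 1)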

# Modeling based on the CDC data
bmi.data <- prepareData(CDC, raw = "bmi", group = "group", age = "age")
bmi.model <- bestModel(bmi.data, raw = "bmi")
printSubset(bmi.model)

# Using a precomputed model formula for gender-specific models
bmi.model.boys <- bestModel(bmi.data[bmi.data$sex == 1, ], predictors = bmi.model$terms)
bmi.model.girls <- bestModel(bmi.data[bmi.data$sex == 2, ], predictors = bmi.model$terms)

# Using a custom list of predictors and incorporating the 'sex' variable
bmi.sex <- bestModel(bmi.data, raw = "bmi", predictors = c(
  "L1", "L3", "A3", "L1A1", "L1A2", "L1A3", "L2A1", "L2A2",
  "L2A3", "L3A1", "L3A2", "L3A3", "sex"
), force.in = c("sex"))

## End(Not run)

[Package cNORM version 3.1.0 Index]