search.sur {ldt}R Documentation

Create a Model Set for SUR Models

Description

Use this function to create a Seemingly Unrelated Regression model set and search for the best models (and other information) based on in-sample and out-of-sample evaluation metrics.

Usage

search.sur(
  data = get.data(),
  combinations = get.combinations(),
  metrics = get.search.metrics(),
  modelChecks = get.search.modelchecks(),
  items = get.search.items(),
  options = get.search.options(),
  searchSigMaxIter = 0,
  searchSigMaxProb = 0.1
)

Arguments

data

A list that determines data and other required information for the search process. Use get.data() function to generate it from a matrix or a data.frame.

combinations

A list that determines the combinations of endogenous and exogenous variables in the search process. Use get.combinations() function to define it.

metrics

A list of options for measuring performance. Use get.search.metrics function to get them.

modelChecks

A list of options for excluding a subset of the model set. Use get.search.modelchecks function to get them.

items

A list of options for specifying the purpose of the search. Use get.search.items function to get them.

options

A list of extra options for performing the search. Use get.search.options function to get them.

searchSigMaxIter

Maximum number of iterations in searching for significant coefficients. Use 0 to disable the search.

searchSigMaxProb

Maximum value of type I error to be used in searching for significant coefficients. If p-value is less than this, it is interpreted as significant.

Value

A nested list with the following members:

counts

Information about the expected number of models, number of estimated models, failed estimations, and some details about the failures.

results

A data frame with requested information in items list.

info

The arguments and some general information about the search process such as the elapsed time.

Note that the output does not contain any estimation results, but minimum required data to estimate the models (Use summary() function to get the estimation).

See Also

estim.sur

Examples

num_y <- 2L # number of equations
num_x_r <- 3L # number of relevant explanatory variables
num_x_ir <-
  10 # (relatively large) number of irrelevant explanatory variables
num_obs = 100  # number of observations

# create random data
sample <- sim.sur(sigma = num_y, coef = num_x_r, nObs = num_obs)
x_ir <- matrix(rnorm(num_obs * num_x_ir), ncol = num_x_ir) # irrelevant data

# prepare data for estimation
data <- data.frame(sample$y, sample$x, x_ir)
colnames(data) <- c(colnames(sample$y), colnames(sample$x), paste0("z", 1:num_x_ir))

# Use systemfit to estimate and analyse:
exp_names <- paste0(colnames(data)[(num_y + 1):(length(colnames((data))))], collapse = " + ")
fmla <- lapply(1:num_y, function(i) as.formula(paste0("Y", i, " ~ -1 + ", exp_names)))
fit <- systemfit::systemfit(fmla, data = data, method = "SUR")
summary(fit)

# You can also use this package estimation function:
fit <- estim.sur(data = get.data(data, endogenous = num_y, addIntercept = FALSE))
print(fit)

# Alternatively, You can define an SUR model set:
x_sizes = c(1:3) # assuming we know the number of relevant explanatory variables is less than 3
num_targets = 2
metric_options <- get.search.metrics(typesIn = c("sic")) # We use SIC for searching
search_res <- search.sur(data = get.data(data, endogenous = num_y, addIntercept = FALSE),
                         combinations = get.combinations(numTargets = num_targets,
                                                         sizes = x_sizes,
                                                         innerGroups = list(c(1), c(2))),
                         metrics = metric_options)
print(search_res)

# Use summary function to estimate the best models:
search_sum <- summary(search_res)

# Print the best model:
print(search_sum$results[[2]]$value)
#   see 'estim.sur' function

# Using a step-wise search to build a larger model set:
x_sizes_steps = list(c(1, 2, 3), c(4))
counts_steps = c(NA, 7)
search_step_res <- search.sur(data = get.data(data, endogenous = num_y, addIntercept = FALSE),
                              combinations = get.combinations(numTargets = num_targets,
                                                              sizes = x_sizes_steps,
                                                              stepsNumVariables = counts_steps,
                                                              innerGroups = list(c(1,2))),
                              metrics = metric_options)
# combinations argument is different

print(search_step_res)


[Package ldt version 0.5.2 Index]