igate.regressions {igate}    R Documentation

Produces the regression plots for sanity check in iGATE

Description

This function takes a data frame, a target variable and a list of ssv and produces a regression plot of each ssv against the target. The plots can optionally be saved as .png files to image_directory. Summary statistics for each fitted model are also returned.

Usage

igate.regressions(df, target, ssv = NULL,
  outlier_removal_target = TRUE, outlier_removal_ssv = TRUE,
  savePlots = FALSE, image_directory = tempdir())

Arguments

df

Data frame to be analysed.

target

Target variable to be analysed.

ssv

A vector of suspected sources of variation. These are the variables in df which we believe might have an influence on the target variable and will be tested. If no list of ssv is provided, the test will be performed on all numeric variables.

outlier_removal_target

Logical. Should outliers (with respect to the target variable) be removed from df (default: TRUE)? Important: This only makes sense if no prior outlier removal has been performed on df, i.e. df still contains all the data. Otherwise the calculation of the outlier threshold will be distorted.

outlier_removal_ssv

Logical. Should outlier removal be performed for each ssv (default: TRUE)?

savePlots

Logical. If FALSE (the default) regression plots will be output to the standard plotting device. If TRUE, regression plots will additionally be saved to image_directory as png files.

image_directory

Directory to which plots should be saved. This is only used if savePlots = TRUE and defaults to the temporary directory of the current R session, i.e. tempdir(). To save plots to the current working directory set savePlots = TRUE and image_directory = getwd().
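
The warning on outlier_removal_target can be illustrated with a small base R sketch. It assumes a 1.5 * IQR rule around the quartiles, which is an assumption made purely for illustration; the rule igate actually applies is not stated on this page. The helper iqr_bounds is hypothetical and not part of the package.

```r
# Hypothetical helper: outlier bounds under an ASSUMED 1.5 * IQR rule.
# This is not taken from igate; the package may use a different rule.
iqr_bounds <- function(x, k = 1.5) {
  q <- stats::quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  spread <- q[2] - q[1]
  c(lower = q[1] - k * spread, upper = q[2] + k * spread)
}

full <- iris$Sepal.Length
bounds_full <- iqr_bounds(full)

# Simulate data that was already trimmed before the analysis:
# values outside the 10% and 90% quantiles of the target are dropped.
trim_at <- stats::quantile(full, c(0.10, 0.90), names = FALSE)
trimmed <- full[full >= trim_at[1] & full <= trim_at[2]]
bounds_trimmed <- iqr_bounds(trimmed)

# Compare the thresholds computed on the full and the pre-trimmed data.
print(rbind(full = bounds_full, trimmed = bounds_trimmed))
```

Under this assumed rule the bounds computed on the pre-trimmed data are narrower than those computed on the full data, so a second round of outlier removal would discard observations that are not outliers in the original data.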

Details

Regression plots of each ssv against target are produced and, if savePlots = TRUE, saved to image_directory. A data frame with summary statistics is also produced, see Value for details.

Value

The regression plots of target against each ssv are output to the standard plotting device and, if savePlots = TRUE, written as .png files to image_directory. Also, a data frame with the following columns is returned:

Causes                  The ssv that were analysed.
outliers_removed        How many outliers (with respect to this ssv) were removed before fitting the linear model?
observations_retained   After outlier removal was performed, how many observations were left and used to fit the model?
regression_plot         Logical. Was fitting the model successful? It can fail, for example, if an ssv is constant.
r_squared               r^2 value of the fitted model.
gradient, intercept     Gradient and intercept of the fitted model.
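
To make the model statistics above concrete, the following base R sketch fits a simple linear model of the target on each numeric column and collects r_squared, gradient and intercept. It is an illustration under stated assumptions, not the package's implementation: no outlier removal is performed, and the names numeric_cols, rows and summary_table are made up for this example.

```r
# Base R sketch (not igate's implementation): regress the target on each
# numeric column and collect statistics similar to those listed above.
df <- iris
target <- "Sepal.Length"

# Candidate ssv: every numeric column except the target itself.
numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
ssv <- setdiff(numeric_cols, target)

rows <- lapply(ssv, function(v) {
  # Builds the formula target ~ v from the two column names.
  fit <- stats::lm(stats::reformulate(v, response = target), data = df)
  coefs <- stats::coef(fit)
  data.frame(
    Causes    = v,
    r_squared = summary(fit)$r.squared,
    gradient  = unname(coefs[2]),
    intercept = unname(coefs[1]),
    stringsAsFactors = FALSE
  )
})

summary_table <- do.call(rbind, rows)
print(summary_table)
```

For a simple linear model with one predictor, r_squared equals the squared correlation between the ssv and the target, which gives a quick way to cross-check a row of the output.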

Examples

igate.regressions(iris, target = "Sepal.Length")
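
A further usage sketch, saving the plots as described for savePlots and image_directory. It uses only arguments documented above; the requireNamespace guard and the variable names res and out_dir are additions for this example, and the names of the saved files are not specified on this page, so they are only listed by extension.

```r
# Runs only if the igate package is installed.
if (requireNamespace("igate", quietly = TRUE)) {
  out_dir <- tempdir()  # the documented default for image_directory

  res <- igate::igate.regressions(
    iris,
    target = "Sepal.Length",
    savePlots = TRUE,
    image_directory = out_dir
  )

  # Summary statistics, one row per analysed ssv.
  print(res)

  # Any .png files now present in the output directory.
  print(list.files(out_dir, pattern = "\\.png$"))
}
```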


[Package igate version 0.3.3 Index]