| validate {igate} | R Documentation | 
Validates results after using igate or categorical.igate.
Description
Takes a new data frame to be used for validation and the causes and control bands
obtained from igate or categorical.igate and returns
all those observations that fall within these control bands.
Usage
validate(validation_df, target, causes, results_df, type = NULL)
Arguments
| validation_df | Data frame to be used for validation. It is recommended to use
a different data frame from the one used in igate or categorical.igate. | 
| target | Target variable that was used in igate or categorical.igate. | 
| causes | Causes determined by igate or categorical.igate. | 
| results_df | The data frame containing the results of igate or categorical.igate. | 
| type | The type of igate that was performed: either "continuous" or "categorical". | 
Details
If a value of Good_Count or Bad_Count in the second returned data frame is
very low, the corresponding cause is excluding many observations from the
first returned data frame. Consider re-running validate with this cause removed from
causes.
Value
A list of three data frames is returned. The first data frame contains those observations
in validation_df that fall into *all* the good or *all* the bad control bands specified in results_df.
Its columns are target, then one column for each of the causes, and a new column
expected_quality, which is "good" if the observation falls into all the good
control bands and "bad" if it falls into all the bad control bands.
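The selection described above can be illustrated with a minimal base R sketch. It does not call the igate package: the data, the bands table and its column names (cause, good_lower, good_upper, bad_lower, bad_upper) and the helper in_all_bands are hypothetical, and treating the band limits as inclusive is an assumption.

```r
# Hypothetical validation data: a target y and two causes x1, x2.
validation_df <- data.frame(
  y  = c(1.0, 1.2, 3.5, 3.8, 2.0),
  x1 = c(0.1, 0.2, 0.9, 0.8, 0.5),
  x2 = c(10, 12, 30, 28, 11)
)

# Hypothetical control bands; these column names are assumptions and
# need not match the results data frame produced by igate.
bands <- data.frame(
  cause      = c("x1", "x2"),
  good_lower = c(0.0, 9),
  good_upper = c(0.3, 13),
  bad_lower  = c(0.7, 25),
  bad_upper  = c(1.0, 35),
  stringsAsFactors = FALSE
)

# TRUE for each row whose value lies inside [lower, upper] for every cause.
in_all_bands <- function(df, causes, lower, upper) {
  keep <- rep(TRUE, nrow(df))
  for (i in seq_along(causes)) {
    x <- df[[causes[i]]]
    keep <- keep & x >= lower[i] & x <= upper[i]
  }
  keep
}

good <- in_all_bands(validation_df, bands$cause, bands$good_lower, bands$good_upper)
bad  <- in_all_bands(validation_df, bands$cause, bands$bad_lower, bands$bad_upper)

# Keep only rows inside all good or all bad bands and label them.
selected <- validation_df[good | bad, ]
selected$expected_quality <- ifelse(good[good | bad], "good", "bad")
print(selected)
```

In this toy data the last row lies outside at least one band of each kind, so it is dropped from the result.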
The second data frame has three columns:
| Cause | Each of the causes. | 
| Good_Count | If we selected all those observations that fall into the good band of this cause, how many observations would we select? | 
| Bad_Count | If we selected all those observations that fall into the bad band of this cause, how many observations would we select? | 
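As a rough illustration of how such per-cause counts could be computed, the following base R sketch counts, for each cause on its own, the observations inside the good band and inside the bad band. The data, the bands table with its column names, and the helper count_in_band are hypothetical, and inclusive band limits are an assumption.

```r
# Hypothetical validation data with two causes.
validation_df <- data.frame(
  x1 = c(0.1, 0.2, 0.9, 0.8, 0.5),
  x2 = c(10, 12, 30, 28, 11)
)

# Hypothetical control bands; column names are assumptions.
bands <- data.frame(
  cause      = c("x1", "x2"),
  good_lower = c(0.0, 9),
  good_upper = c(0.3, 13),
  bad_lower  = c(0.7, 25),
  bad_upper  = c(1.0, 35),
  stringsAsFactors = FALSE
)

# Number of observations of one cause that lie inside [lower, upper].
count_in_band <- function(df, cause, lower, upper) {
  x <- df[[cause]]
  sum(x >= lower & x <= upper)
}

rows <- seq_len(nrow(bands))
counts <- data.frame(
  Cause = bands$cause,
  Good_Count = vapply(rows, function(i) {
    count_in_band(validation_df, bands$cause[i],
                  bands$good_lower[i], bands$good_upper[i])
  }, integer(1)),
  Bad_Count = vapply(rows, function(i) {
    count_in_band(validation_df, bands$cause[i],
                  bands$bad_lower[i], bands$bad_upper[i])
  }, integer(1)),
  stringsAsFactors = FALSE
)
print(counts)
```

A cause whose count is much lower than the others is the one that restricts the joint selection most.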
The third data frame summarizes the first data frame. If type = "continuous", it has
three columns:
| expected_quality | Either "good" or "bad". | 
| max_target | The maximum value of target for the observations with "good"
or "bad" expected quality, respectively. | 
| min_target | The minimum value of target for the observations with "good" or "bad" expected quality, respectively. | 
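A summary of this shape can be sketched in base R with tapply. The data frame first_df below is a hypothetical stand-in for the first returned data frame, with a target column named y; it is not output of validate.

```r
# Hypothetical stand-in for the first returned data frame.
first_df <- data.frame(
  y = c(1.0, 1.2, 3.5, 3.8),
  expected_quality = c("good", "good", "bad", "bad"),
  stringsAsFactors = FALSE
)

# Maximum and minimum of the target within each expected quality group.
max_target <- tapply(first_df$y, first_df$expected_quality, max)
min_target <- tapply(first_df$y, first_df$expected_quality, min)

summary_df <- data.frame(
  expected_quality = names(max_target),
  max_target = as.vector(max_target),
  min_target = as.vector(min_target),
  stringsAsFactors = FALSE
)
print(summary_df)
```

If the two target ranges do not overlap, as in this toy data, the control bands separate good from bad observations cleanly on the validation data.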
If type = "categorical", it has the following three columns:
| expected_quality | Either "good" or "bad". | 
| Category | The categories of the observations with good or bad expected quality, respectively. | 
| Frequency | A count of how often the respective Category appears among the observations with
good or bad expected quality. | 
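For the categorical case, a frequency summary of this shape can be sketched in base R with table. The data frame first_df is a hypothetical stand-in for the first returned data frame, with a categorical target column named target; it is not output of validate.

```r
# Hypothetical stand-in for the first returned data frame.
first_df <- data.frame(
  target = c("A", "A", "B", "C", "C", "C"),
  expected_quality = c("good", "good", "good", "bad", "bad", "bad"),
  stringsAsFactors = FALSE
)

# Cross-tabulate expected quality against category, then reshape to long form.
freq <- as.data.frame(
  table(expected_quality = first_df$expected_quality,
        Category = first_df$target),
  responseName = "Frequency",
  stringsAsFactors = FALSE
)

# Drop combinations that never occur.
freq <- freq[freq$Frequency > 0, ]
print(freq)
```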
Examples
# resultsIris is assumed to hold the results of an earlier igate run on iris.
validate(iris, target = "Sepal.Length", causes = resultsIris$Causes, results_df = resultsIris)