splitratio {SPlit} | R Documentation |
Optimal splitting ratio
Description
splitratio() finds the optimal splitting ratio by assuming that a polynomial regression model with interactions can approximate the true model. The number of parameters in the model is estimated from the full data using stepwise regression. A simpler alternative is to set the number of parameters to the square root of the number of unique rows in the input matrix of the dataset. Please see Joseph (2022) for details.
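The simple choice can be illustrated with a short base-R sketch. It assumes the rule reported in Joseph (2022) that the optimal training-to-testing ratio is sqrt(p):1 for a model with p parameters, so the testing fraction is 1/(sqrt(p) + 1). The helper name `simple_ratio_sketch` and the demo matrix are hypothetical and are not part of the SPlit package; the package's own code may differ in detail.

```r
# Hypothetical helper, not part of SPlit: testing fraction under the
# assumed sqrt(p):1 training/testing rule, with p taken as the square
# root of the number of unique rows (the "simple" choice).
simple_ratio_sketch <- function(x) {
  x <- as.matrix(x)              # treat a vector as a one-column matrix
  n_unique <- nrow(unique(x))    # number of distinct rows
  p <- sqrt(n_unique)            # assumed number of parameters
  1 / (sqrt(p) + 1)              # fraction of rows reserved for testing
}

x_demo <- matrix(seq_len(200), ncol = 2)  # 100 distinct rows
simple_ratio_sketch(x_demo)               # about 0.24
```

With 100 unique rows, p is 10 and the sketch reserves roughly a quarter of the data for testing.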
Usage
splitratio(x, y, method = "simple", degree = 2)
Arguments
x |
Input matrix |
y |
Response (output variable) |
method |
Either “simple” or “regression”. The default, “simple”, uses the square root of the number of unique rows in x as the number of parameters; “regression” estimates the number of parameters by stepwise regression. |
degree |
Degree of the polynomial to be fitted; needed only if method = “regression”. |
Value
Splitting ratio, which is the fraction of the dataset to be used for testing.
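As a rough illustration of how the returned fraction might be applied, the sketch below holds out that share of the rows for testing. The value 0.24 is a placeholder standing in for the output of splitratio(), and the plain random split is only a stand-in for whatever splitting method is actually used; the names `dat`, `ratio`, `test_set`, and `train_set` are hypothetical.

```r
set.seed(1)                                   # reproducible demo data
dat <- data.frame(x = rnorm(100))
dat$y <- rnorm(100, mean = dat$x^2)

ratio <- 0.24                                 # placeholder for a splitratio() result
n_test <- round(ratio * nrow(dat))            # number of rows held out for testing
test_idx <- sample(nrow(dat), n_test)         # random row indices, no replacement

test_set <- dat[test_idx, ]                   # testing portion
train_set <- dat[-test_idx, ]                 # remaining rows for training
```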
References
Joseph, V. R. (2022). Optimal Ratio for Data Splitting. Statistical Analysis & Data Mining: The ASA Data Science Journal, to appear.
Examples
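library(SPlit)
# Simulate a quadratic relationship with unit-variance noise
X = rnorm(n=100, mean=0, sd=1)
Y = rnorm(n=100, mean=X^2, sd=1)
# Default "simple" method
splitratio(x=X, y=Y)
# Number of parameters estimated by stepwise regression
splitratio(x=X, y=Y, method="regression")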