perform_regression {NIPTeR} | R Documentation |
Regression based Z score
Description
Make multiple models using linear regression and calculate Z-score
Usage
perform_regression(nipt_sample, nipt_control_group, chromo_focus,
n_models = 4, n_predictors = 4, exclude_chromosomes = NULL,
include_chromosomes = NULL, use_test_train_set = T,
size_of_train_set = 0.6, overdispersion_rate = 1.15,
force_practical_cv = F)
Arguments
nipt_sample |
The NIPTSample object that is the focus of the analysis |
nipt_control_group |
The NIPTControlGroup object used in the analysis |
chromo_focus |
The chromosome of interest. Most commonly chromosome 13, 18 or 21. However, every autosomal chromosome can be predicted |
n_models |
Integer. The number of linear models to build. Default is 4 |
n_predictors |
Integer. The number of predictors each model contains. Default is 4 |
exclude_chromosomes |
Integer. Autosomal chromosomes to exclude as potential predictors. By default, the potentially trisomic chromosomes 13, 18 and 21 are excluded. |
include_chromosomes |
Integer. Potentially trisomic chromosomes to include as potential predictors. Options are chromosomes 13, 18 and 21 |
use_test_train_set |
Use a test and train set to build the models? Default is TRUE |
size_of_train_set |
The size of the train set expressed as a decimal fraction. Default is 0.6 (60% of the control samples) |
overdispersion_rate |
The standard error of the mean is multiplied by this factor |
force_practical_cv |
Boolean. Ignore the theoretical CV and always use the practical CV? |
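As a rough illustration of what size_of_train_set means, the sketch below splits a set of control samples into a train and a test set. It assumes a simple random split by index with rounding, which may differ from what the package does internally. The sample names and the seed are hypothetical.

```r
# Illustrative sketch only; not the NIPTeR implementation.
set.seed(1)  # fixed seed, only so this illustration is reproducible

# Hypothetical control sample names
control_names <- paste0("control_", 1:10)

# Fraction of control samples assigned to the train set
size_of_train_set <- 0.6

# Number of train samples (rounding is an assumption of this sketch)
n_train <- round(length(control_names) * size_of_train_set)

# Randomly pick the train samples; the remainder forms the test set
train_idx <- sample(seq_along(control_names), n_train)
train_set <- control_names[train_idx]
test_set <- control_names[-train_idx]
```

With 10 control samples and a fraction of 0.6, this yields 6 train samples and 4 test samples.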
Details
The regression-based Z-score builds n models with m predictors using stepwise regression with forward selection. The models are used to predict the chromosomal fraction of interest for the sample and for the control group. The observed fractions are then divided by the expected fractions, and Z-scores are calculated over these ratios. The Z-score is calculated by subtracting one from the ratio of the sample and dividing the result by the coefficient of variation (CV). The CV can be either the Practical CV or the Theoretical CV. The Theoretical CV is the standard error multiplied by the overdispersion rate. Theoretically, the CV cannot be lower than the standard error of the mean, so if the Practical CV is lower than the Theoretical CV, the Theoretical CV is used (unless force_practical_cv is TRUE).
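The Z-score step described above can be sketched in a few lines of R. This is a minimal illustration, not the package's internal code. All numeric inputs are made-up placeholders, the helper names choose_cv and regression_z are hypothetical, and the Practical CV is assumed here to be the standard deviation of the control ratios divided by their mean.

```r
# Illustrative sketch only; not the NIPTeR implementation.
# All values below are made-up placeholders.

# Observed / expected chromosomal fraction per control sample (hypothetical)
control_ratios <- c(0.998, 1.003, 1.001, 0.996, 1.002, 1.000, 0.999, 1.004)
# Observed / expected fraction for the sample of interest (hypothetical)
sample_ratio <- 1.012

# Practical CV: assumed here to be sd / mean of the control ratios
practical_cv <- sd(control_ratios) / mean(control_ratios)

# Theoretical CV: standard error multiplied by the overdispersion rate.
# The standard error value is an assumed placeholder.
standard_error <- 0.002
overdispersion_rate <- 1.15
theoretical_cv <- standard_error * overdispersion_rate

# Pick the CV: fall back to the theoretical CV when the practical CV is
# lower, unless the practical CV is forced (hypothetical helper)
choose_cv <- function(practical_cv, theoretical_cv, force_practical_cv = FALSE) {
  if (force_practical_cv || practical_cv >= theoretical_cv) {
    list(cv = practical_cv, type = "Practical_CV")
  } else {
    list(cv = theoretical_cv, type = "Theoretical_CV")
  }
}

# Z-score: (ratio - 1) / CV, because the expected ratio is one
# (hypothetical helper)
regression_z <- function(ratio, cv) (ratio - 1) / cv

cv_used <- choose_cv(practical_cv, theoretical_cv)
z_sample <- regression_z(sample_ratio, cv_used$cv)
z_controls <- regression_z(control_ratios, cv_used$cv)
```

Here cv_used$type records which CV was applied, similar in spirit to the cv_types row of prediction_statistics.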
The output of this function is an object of type RegressionResult, a named list containing:
-
prediction_statistics A dataframe with 7 rows and a column for every model. The rows are:
-
Z_score_sample The regression based Z score for the model
-
CV The coefficient of variation for the model
-
cv_types The CV type used to calculate the regression based Z score for the model. Either Practical_CV or Theoretical_CV
-
P_value_shapiro The P value of the Shapiro-Wilk test for normality of the control group regression based Z scores for the model
-
Predictor_chromosomes The predictor chromosomes used in the model
-
Mean_test_set The mean of the test set. Note that for calculating the regression based Z scores the mean is replaced by one. The mean, however, can be seen as a quality metric for the model
-
CV_train_set The CV of the train set. The difference between this CV and the CV of the test can be used as a measure to quantify overfit
-
-
control_group_Zscores A matrix containing the regression based Z-scores for the control samples
-
focus_chromosome The chromosome of interest. Most commonly chromosome 13, 18 or 21. However, every autosomal chromosome can be predicted
-
correction_status The correction status of the control group autosomes
-
control_group_sample_names The sample names of the test set group
-
models List of the summary.lm output for every model
-
potential_predictors The total pool of chromosomes where the predictors are selected from
-
all_control_group_Z_scores Z-scores for every sample using theoretical and practical CVs
-
additional_statistics Statistics for both the practical and theoretical CVs for every prediction set
Value
RegressionResult object
Examples
## Not run:
regression_score_21 <- perform_regression(nipt_sample = sample_of_interest,
nipt_control_group = control_group, chromo_focus = 21)
## End(Not run)