split_data {promor} R Documentation

Split the data frame to create training and test data

Description

This function creates balanced training and test splits of the protein intensity data in a model_df object.

Usage

split_data(model_df, train_size = 0.8, seed = NULL)

Arguments

model_df

A model_df object created by pre_process.

train_size

The size of the training data set as a proportion of the complete data set. Default is 0.8.

seed

Numerical. Random number seed. Default is NULL.

Details

This function splits the model_df object into training and test data sets using random sampling while preserving the original class distribution of the data. Set the random number seed with the seed argument to make the split reproducible.
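
The idea of a class-preserving (stratified) split with a fixed seed can be illustrated with a minimal base R sketch. This is not promor's implementation: the helper stratified_split, the toy data frame, and the assumption that the class label lives in a column named condition are all hypothetical and used only for illustration.

```r
# Hypothetical helper (not part of promor): stratified train/test split.
# `class_col` names the column holding the class label (an assumption).
stratified_split <- function(df, class_col, train_size = 0.8, seed = NULL) {
  # Fix the random number seed only when one is supplied
  if (!is.null(seed)) set.seed(seed)

  # Group row indices by class label
  idx_by_class <- split(seq_len(nrow(df)), df[[class_col]])

  # Within each class, sample the requested proportion of rows
  train_idx <- unlist(lapply(idx_by_class, function(idx) {
    n_train <- round(length(idx) * train_size)
    idx[sample.int(length(idx), n_train)]
  }), use.names = FALSE)

  # Everything not sampled for training goes to the test set
  test_idx <- setdiff(seq_len(nrow(df)), train_idx)

  list(
    training = df[train_idx, , drop = FALSE],
    test = df[test_idx, , drop = FALSE]
  )
}

# Toy data: 60 rows of class "A" and 40 rows of class "B"
toy <- data.frame(
  condition = rep(c("A", "B"), times = c(60, 40)),
  protein_1 = seq_len(100) / 10
)

split1 <- stratified_split(toy, "condition", train_size = 0.8, seed = 8314)
split2 <- stratified_split(toy, "condition", train_size = 0.8, seed = 8314)

# The training set keeps the original 60/40 class ratio
table(split1$training$condition)

# The same seed gives the same split
identical(split1, split2)
```

Sampling within each class, rather than across all rows at once, is what keeps the class proportions of the training and test sets close to those of the full data set. Without a seed, repeated calls would generally return different splits.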

Value

A list of data frames: the training data set (training) and the test data set (test).

Author(s)

Chathurani Ranathunge

See Also

Examples


## Create a model_df object
covid_model_df <- pre_process(covid_fit_df, covid_norm_df)

## Split the data frame into training and test data sets using default settings
covid_split_df1 <- split_data(covid_model_df, seed = 8314)

## Split the data frame into training and test data sets with 70% of the
## data in training and 30% in test data sets
covid_split_df2 <- split_data(covid_model_df, train_size = 0.7, seed = 8314)

## Access training data set
covid_split_df1$training

## Access test data set
covid_split_df1$test


[Package promor version 0.2.1 Index]