subsetData {MoTBFs}        R Documentation

Dataset subsetting

Description

A collection of functions for subsetting a "data.frame" by rows or columns, and for creating training and test partitions.

Usage

TrainingandTestData(data, percentage_test, discreteVariables = NULL)

newData(data, nameX, nameY)

splitdata(data, nameVariable, min, max)

Arguments

data

A dataset of class data.frame.

percentage_test

The proportion of data that goes to the test set (between 0 and 1).

discreteVariables

A character vector with the names of the discrete variables.

nameX

A character vector with the name of the child variable in the conditional method.

nameY

A character vector with the names of the parent variables in the conditional method.

nameVariable

A character vector with the name of the variable to be filtered.

min, max

Lower and upper boundary values of nameVariable used for filtering.

Value

TrainingandTestData() returns a list with two elements, Training and Test, containing the training and test datasets. newData() returns a subset of the variables (columns), and splitdata() returns a subset of the observations (rows).
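
For readers who want to see what these three operations amount to, the following is a minimal base-R sketch. It is not the MoTBFs implementation: the helper split_train_test() is hypothetical, the sampling scheme is an assumption, and the row filter assumes that observations with min <= X <= max are the ones kept.

```r
# Base-R sketch only; this is not the MoTBFs implementation.
set.seed(1)  # make the random partition reproducible
data <- data.frame(X = rnorm(100),
                   Y = rchisq(100, df = 8),
                   Z = rep(letters[1:10], times = 10))

# Hypothetical helper: hold out a proportion of the rows as the test set.
split_train_test <- function(data, percentage_test) {
  n_test <- round(nrow(data) * percentage_test)
  test_rows <- sample(nrow(data), n_test)
  train_rows <- setdiff(seq_len(nrow(data)), test_rows)
  list(Training = data[train_rows, , drop = FALSE],
       Test = data[test_rows, , drop = FALSE])
}

TT <- split_train_test(data, percentage_test = 0.2)
nrow(TT$Training)
nrow(TT$Test)

# Column subset: keep a child variable together with its parent(s).
child_parents <- data[, c("X", "Z"), drop = FALSE]

# Row filter: assumes rows with min <= X <= max are the ones kept.
in_range <- data[data$X >= 0 & data$X <= 1, , drop = FALSE]
```

With 100 rows and percentage_test = 0.2, the sketch places 20 rows in Test and the remaining 80 in Training, and no row appears in both.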

Examples


## Dataset
X <- rnorm(1000)
Y <- rchisq(1000, df = 8)
Z <- rep(letters[1:10], times = 1000/10)
data <- data.frame(X = X, Y = Y, Z = Z)
data <- discreteVariables_as.character(dataset = data, discreteVariables = "Z")

## Training and Test Datasets
TT <- TrainingandTestData(data, percentage_test = 0.2)
TT$Training
TT$Test

## Subset Dataset
newData(data, nameX = "X", nameY = "Z")
splitdata(data, nameVariable = "X", min = 2, max = 3)


[Package MoTBFs version 1.4.1 Index]