TrainBuddle {Buddle} R Documentation

Implementing Statistical Classification and Regression.

Description

Build a multi-layer feed-forward neural network model for statistical classification and regression analysis with random effects.

Usage

TrainBuddle(
formula.string,
data,
train.ratio = 0.7,
arrange = 0,
batch.size = 10,
total.iter = 10000,
hiddenlayer = c(100),
batch.norm = TRUE,
drop = TRUE,
drop.ratio = 0.1,
lr = 0.1,
init.weight = 0.1,
activation = c("Sigmoid"),
optim = "SGD",
type = "Classification",
rand.eff = FALSE,
distr = "Normal",
disp = TRUE
)

Arguments

formula.string

a formula string or a vector of numeric values. When it is a string, it denotes a classification or regression equation of the form label ~ predictors or response ~ predictors, where the predictors are separated by the + operator. When it is a numeric vector, it is the label or the response variable of a classification or regression equation, respectively.

data

a data frame or a design matrix. When formula.string is a string, data should be a data frame that includes the label (or the response) and the predictors named in the formula string. When formula.string is a vector, i.e. a vector of labels or responses, data should be an n-by-p numeric matrix whose columns are the predictors for classification or regression.

train.ratio

a ratio used to split data into training and test sets. When data is an n-by-p matrix, the resulting training data will be a (train.ratio x n)-by-p matrix. The default is 0.7.

arrange

a logical value to arrange data when splitting it into training and test sets; used for classification only (automatically set to FALSE for regression). If it is TRUE, data will be arranged so that the resulting training set contains the specified ratio (train.ratio) of each label of the whole data. See also Split2TrainTest().

batch.size

a batch size used for training during iterations.

total.iter

a number of iterations used for training.

hiddenlayer

a vector of the numbers of nodes in the hidden layers.

batch.norm

a logical value to specify whether or not to use the batch normalization option for training. The default is TRUE.

drop

a logical value to specify whether or not to use the dropout option for training. The default is TRUE.

drop.ratio

a ratio for the dropout; used only if drop is TRUE. The default is 0.1.

lr

a learning rate. The default is 0.1.

init.weight

a weight used to initialize the weight matrix of each layer. The default is 0.1.

activation

a vector of activation functions used in the hidden layers, one per layer. For two hidden layers (e.g., hiddenlayer=c(100, 50)), it is a vector of two activation functions, e.g., c("Sigmoid", "SoftPlus"). The available activation functions are Sigmoid, Relu, LeakyRelu, TanH, ArcTan, ArcSinH, ElliotSig, SoftPlus, BentIdentity, Sinusoid, Gaussian, Sinc, and Identity. For details of the activation functions, please refer to Wikipedia.

optim

an optimization method used for training. The following methods are available: "SGD", "Momentum", "AdaGrad", "Adam", "Nesterov", and "RMSprop".

type

a statistical model for the analysis: "Classification" or "Regression".

rand.eff

a logical value to specify whether or not to add a random effect to the classification or regression.

distr

a distribution of the random effect; used only if rand.eff is TRUE. The following distributions are available: "Normal", "Exponential", "Logistic", and "Cauchy".

disp

a logical value which specifies whether or not to display intermediate training results (loss and accuracy) during the iterations.
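
To make the effect of train.ratio and arrange concrete, the following is a minimal base R sketch of a plain random split versus a per-label split. It is an illustration only and is not Buddle's implementation: the helper name split_indices is hypothetical, and rounding train.ratio x n to the nearest integer is an assumption, since the text above does not say how fractional sizes are handled.

```r
# Hypothetical helper (not part of Buddle): returns the row indices of the
# training set for a given vector of labels.
split_indices <- function(labels, train.ratio = 0.7, arrange = FALSE) {
  n <- length(labels)
  if (!arrange) {
    # Plain random split: about train.ratio * n rows go to the training set.
    return(sort(sample.int(n, size = round(train.ratio * n))))
  }
  # Per-label split: take about train.ratio of the rows within each label,
  # so every label is represented in the training set in the same proportion.
  by_label <- split(seq_len(n), labels)
  picked <- lapply(by_label, function(idx) {
    idx[sample.int(length(idx), size = round(train.ratio * length(idx)))]
  })
  sort(unlist(picked, use.names = FALSE))
}

set.seed(1)
train_idx <- split_indices(iris$Species, train.ratio = 0.6, arrange = TRUE)
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]
table(train$Species)   # 30 rows per species, i.e. 60% of each 50-row class
```

With arrange = FALSE the same helper returns round(train.ratio * n) indices drawn without regard to the labels, so the label proportions in the training set can drift from those of the whole data. For the split that TrainBuddle() actually performs, see Split2TrainTest().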

Value

A list of the following values:

lW

a list of n terms of weight matrices where n is equal to the number of hidden layers plus one.

lb

a list of n terms of bias vectors where n is equal to the number of hidden layers plus one.

lParam

a list of parameters used for the training process.

train.loss

a vector of loss values of the training set obtained during the iterations; its length is equal to the number of epochs.

train.accuracy

a vector of accuracy values of the training set obtained during the iterations; its length is equal to the number of epochs.

test.loss

a vector of loss values of the test set obtained during the iterations; its length is equal to the number of epochs.

test.accuracy

a vector of accuracy values of the test set obtained during the iterations; its length is equal to the number of epochs.

predicted.softmax

an r-by-n numeric matrix where r is the number of labels (classification) or 1 (regression), and n is the size of the test set. Its entries are the predicted softmax values (classification) or the predicted values (regression) for the test set, obtained by using the weight matrices (lW) and biases (lb).

predicted.encoding

an r-by-n numeric matrix which is a result of one-hot encoding of the predicted.softmax; valid for classification only.

confusion.matrix

an r-by-r confusion matrix; valid for classification only.

precision

an (r+1)-by-3 matrix which reports precision, recall, and F1 of each label; valid for classification only.
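
The relationship between predicted.softmax, predicted.encoding, confusion.matrix, and precision can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the softmax matrix and the true labels are synthetic, the orientation of the confusion matrix (rows for true labels, columns for predicted labels) is assumed, and only the per-label rows of the precision table are computed, because the content of the additional row is not described above.

```r
# Synthetic r-by-n softmax matrix (r = 3 labels, n = 6 test cases).
# matrix() fills column by column, so each group of three is one test case.
softmax <- matrix(c(0.7, 0.2, 0.1,
                    0.1, 0.8, 0.1,
                    0.2, 0.3, 0.5,
                    0.6, 0.3, 0.1,
                    0.3, 0.4, 0.3,
                    0.1, 0.2, 0.7), nrow = 3)
truth <- c(1, 2, 3, 2, 2, 3)        # hypothetical true label indices

# One-hot encoding: put a 1 in the row holding the largest value of each column.
pred <- apply(softmax, 2, which.max)
encoding <- matrix(0, nrow(softmax), ncol(softmax))
encoding[cbind(pred, seq_along(pred))] <- 1

# r-by-r confusion matrix; rows = true label, columns = predicted label
# (assumed orientation).
r <- nrow(softmax)
cm <- table(factor(truth, levels = seq_len(r)),
            factor(pred,  levels = seq_len(r)))
cm <- matrix(cm, r, r)

# Per-label precision, recall, and F1.
precision <- diag(cm) / colSums(cm)
recall    <- diag(cm) / rowSums(cm)
f1        <- 2 * precision * recall / (precision + recall)
cbind(precision, recall, f1)
```

In this toy case the fourth test case has true label 2 but is predicted as label 1, which lowers the precision of label 1 to 0.5 and the recall of label 2 to 2/3. The values returned by TrainBuddle() are computed by the package itself; see MakeConfusionMatrix() and GetPrecision().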

References

 Geron, A. Hands-On Machine Learning with Scikit-Learn and TensorFlow. Sebastopol: O'Reilly, 2017. Print.

 Han, J., Pei, J., Kamber, M. Data Mining: Concepts and Techniques. New York: Elsevier, 2011. Print.

 Weidman, S. Deep Learning from Scratch. O'Reilly Media, 2019. Print.

See Also

CheckNonNumeric(), GetPrecision(), FetchBuddle(), MakeConfusionMatrix(), OneHot2Label(), Split2TrainTest()

Examples

####################
# train.ratio = 0.6                    ## 60% of data is used for training
# batch.size = 10
# total.iter = 100
# hiddenlayer = c(20,10)               ## Use two hidden layers
# arrange = TRUE                       ## Use the "arrange" option
# activation = c("Relu","SoftPlus")    ## Use Relu and SoftPlus
# optim = "Nesterov"                   ## Use the "Nesterov" method for the optimization
# type = "Classification"
# rand.eff = TRUE                      ## Add a random effect
# distr = "Normal"                     ## The random effect is a normal random variable
# disp = TRUE                          ## Display intermediate results during iterations

data(iris)

lst = TrainBuddle("Species~Sepal.Width+Petal.Width", iris, train.ratio=0.6,
arrange=TRUE, batch.size=10, total.iter = 100, hiddenlayer=c(20, 10),
batch.norm=TRUE, drop=TRUE, drop.ratio=0.1, lr=0.1, init.weight=0.1,
activation=c("Relu","SoftPlus"), optim="Nesterov",
type = "Classification", rand.eff=TRUE, distr = "Normal", disp=TRUE)

lW = lst$lW
lb = lst$lb
lParam = lst$lParam

confusion.matrix = lst$confusion.matrix
precision = lst$precision

confusion.matrix
precision

### Another classification example
### Using mnist data

data(mnist_data)

Img_Mat = mnist_data$Images
Img_Label = mnist_data$Labels

##### Use 100 images

X = Img_Mat                   ### X: 100 x 784 matrix
Y = Img_Label                 ### Y: 100 x 1 vector

lst = TrainBuddle(Y, X, train.ratio=0.6, arrange=TRUE, batch.size=10, total.iter = 100,
hiddenlayer=c(20, 10), batch.norm=TRUE, drop=TRUE,
drop.ratio=0.1, lr=0.1, init.weight=0.1,
type = "Classification", rand.eff=TRUE, distr = "Logistic", disp=TRUE)

confusion.matrix = lst$confusion.matrix
precision = lst$precision

confusion.matrix
precision

###############   Regression example

n=100
p=10
X = matrix(rnorm(n*p, 1, 1), n, p)  ## X is a 100-by-10 design matrix
b = matrix( rnorm(p, 1, 1), p,1)
e = matrix(rnorm(n, 0, 1), n,1)
Y = X %*% b + e                     ### Y=X b + e
######### train.ratio=0.7
######### batch.size=20
######### arrange=FALSE
######### total.iter = 100
######### hiddenlayer=c(20)
######### activation = c("Identity")