Feed_Reduction {LilRhino} | R Documentation |
A Function for converting data into approximations of probability space.
Description
It takes the unique labels in the training data and fits a one-vs-all binary neural network for each unique label. The output is an approximation of the probability that each individual input matches the label. Travis Barton (2018) http://wbbpredictions.com/wp-content/uploads/2018/12/Redditbot_Paper.pdf
Usage
Feed_Reduction(X, Y, X_test, val_split = .1,
nodes = NULL, epochs = 15,
batch_size = 30, verbose = 0)
Arguments
X |
Training data |
Y |
Training labels |
X_test |
Testing data |
val_split |
The validation split for the binary keras neural networks |
nodes |
The number of nodes for the hidden layers. The default is 1/4 of the length of the training data. |
epochs |
The number of epochs for the fitting of the networks |
batch_size |
The batch size for the networks |
verbose |
Whether or not you want details about the run as it's happening. 0 = silent, 1 = progress bar, 2 = one line per epoch. |
Details
This is a new technique for dimensionality reduction of my own creation. Data is converted to the same number of dimensions as there are unique labels. Each dimension is an approximation of the probability that the data point belongs to a given unique label. The return value is a list of the training and test data with their dimensionality reduced.
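To make the one-vs-all idea concrete, here is a minimal base R sketch. It is not the package's implementation: it assumes logistic regression (glm) as a stand-in for the keras binary networks, and one_vs_all_reduce is a hypothetical helper name used only for illustration. It shows that the reduced data has one column per unique label.

```r
# ASSUMPTION: glm() logistic regression replaces the keras binary networks.
# one_vs_all_reduce is a hypothetical helper, not part of LilRhino.
one_vs_all_reduce <- function(X, Y, X_test) {
  Y <- as.character(Y)
  labels <- sort(unique(Y))
  train_df <- as.data.frame(X)
  test_df <- as.data.frame(X_test)
  names(test_df) <- names(train_df)

  fit_one <- function(lab) {
    # Binary target: 1 if the row carries this label, 0 otherwise
    d <- cbind(train_df, is_label = as.integer(Y == lab))
    # Warnings are suppressed because a perfectly separable label
    # makes glm() warn about fitted probabilities of 0 or 1
    m <- suppressWarnings(glm(is_label ~ ., data = d, family = binomial()))
    list(
      train = suppressWarnings(predict(m, newdata = train_df, type = "response")),
      test = suppressWarnings(predict(m, newdata = test_df, type = "response"))
    )
  }

  fits <- lapply(labels, fit_one)
  # One column of approximate probabilities per unique label
  Train <- do.call(cbind, lapply(fits, function(f) f$train))
  Test <- do.call(cbind, lapply(fits, function(f) f$test))
  colnames(Train) <- colnames(Test) <- labels
  list(Train = Train, Test = Test)
}

set.seed(1)
idx <- sample(nrow(iris), 100)
res <- one_vs_all_reduce(iris[idx, 1:4], iris$Species[idx], iris[-idx, 1:4])
dim(res$Train)  # rows of training data by number of unique labels
dim(res$Test)
```

The four iris measurements are reduced to three columns, one per species, mirroring how Feed_Reduction is described as mapping data to as many dimensions as there are unique labels.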
Value
Train |
The training data in the new probability space |
Test |
The testing data in the new probability space |
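One simple way to read a matrix in this probability space is to take the column with the largest value in each row. The sketch below uses made-up numbers and hypothetical label names, and it assumes the columns are in the same order as the labels vector; the example in this help page fits an SVM on the reduced data instead.

```r
# Toy stand-in for reduced test data: 3 rows, one column per label.
# ASSUMPTION: the values are invented and the columns follow `labels` order.
labels <- c("a", "b", "c")
probs <- matrix(c(0.9, 0.2, 0.1,
                  0.1, 0.7, 0.3,
                  0.2, 0.1, 0.8),
                nrow = 3, byrow = TRUE)

# max.col() returns, for each row, the index of the largest entry
pred <- labels[max.col(probs, ties.method = "first")]
pred
```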
Author(s)
Travis Barton.
References
Check out http://wbbpredictions.com/wp-content/uploads/2018/12/Redditbot_Paper.pdf for details on the process
See Also
Binary_Network
Examples
## Not run:
if(8 * .Machine$sizeof.pointer == 64){
#Feed Network Testing
library(keras)
install_keras() # one-time setup; skip if keras is already installed
dat <- keras::dataset_mnist()
X_train = array_reshape(dat$train$x/255, c(nrow(dat$train$x/255), 784))
y_train = dat$train$y
X_test = array_reshape(dat$test$x/255, c(nrow(dat$test$x/255), 784))
y_test = dat$test$y
Reduced_Data2 = Feed_Reduction(X_train, y_train, X_test,
val_split = .3, nodes = 350,
epochs = 30, batch_size = 50, verbose = 1)
library(e1071)
names(Reduced_Data2$test) = names(Reduced_Data2$train)
newdat = as.data.frame(cbind(rbind(Reduced_Data2$train, Reduced_Data2$test), c(y_train, y_test)))
colnames(newdat) = c(paste("V", c(1:11), sep = ""))
mod = svm(V11~., data = newdat, subset = c(1:60000),
kernel = 'linear', cost = 1, type = 'C-classification')
preds = predict(mod, newdat[60001:70000,-11])
sum(preds == y_test)/10000
}
## End(Not run)