pmml.neighbr {pmml}    R Documentation
Generate PMML for a neighbr object from the neighbr package.
Description
Generate PMML for a neighbr object from the neighbr package.
Usage
## S3 method for class 'neighbr'
pmml(
model,
model_name = "kNN_model",
app_name = "SoftwareAG PMML Generator",
description = "K Nearest Neighbors Model",
copyright = NULL,
model_version = NULL,
transforms = NULL,
missing_value_replacement = NULL,
...
)
Arguments
model
    A neighbr object.
model_name
    A name to be given to the PMML model.
app_name
    The name of the application that generated the PMML.
description
    A descriptive text for the Header element of the PMML.
copyright
    The copyright notice for the model.
model_version
    A string specifying the model version.
transforms
    Data transformations. Not supported by this converter; must be left as NULL.
missing_value_replacement
    Value to be used as the 'missingValueReplacement' attribute for all MiningFields.
...
    Further arguments passed to or from other methods.
Details
The model is represented in the PMML NearestNeighborModel format.
The current version of this converter does not support transformations (transforms must be left as NULL). It sets categoricalScoringMethod to "majorityVote", continuousScoringMethod to "average", and isTransformed to "false".
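To make the two fixed scoring methods concrete, the following base R sketch shows what "majorityVote" and "average" amount to for a single test row. The neighbor values and the helper names majority_vote and average_score are hypothetical and are not part of the pmml or neighbr packages; tie handling in particular is a simplifying assumption.

```r
# Hypothetical target values of the k = 3 nearest neighbors of one test row.
neighbor_species <- c("virginica", "versicolor", "virginica")
neighbor_petal_width <- c(2.0, 1.8, 2.3)

# categoricalScoringMethod = "majorityVote": the most frequent class wins.
# Assumption: ties are not treated specially; which.max() keeps the first maximum.
majority_vote <- function(classes) {
  counts <- table(classes)
  names(counts)[which.max(counts)]
}

# continuousScoringMethod = "average": unweighted mean of the neighbor values.
average_score <- function(values) {
  mean(values)
}

predicted_species <- majority_vote(neighbor_species)
predicted_petal_width <- average_score(neighbor_petal_width)

print(predicted_species)
print(predicted_petal_width)
```

Here two of the three neighbors are "virginica", so that class is predicted, and the continuous prediction is the plain mean of the three petal widths.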
Value
PMML representation of the neighbr object.
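As a rough sketch of how the returned object might be inspected and written to disk, the snippet below reuses the knn() call from the Examples section. It assumes the returned object is an XML node as defined by the XML package, so that XML::xmlName() and XML::saveXML() apply; the output path is a temporary file chosen for illustration. The code is skipped when the required packages are not installed.

```r
# Run only if the needed packages are available.
have_pkgs <- requireNamespace("neighbr", quietly = TRUE) &&
  requireNamespace("pmml", quietly = TRUE) &&
  requireNamespace("XML", quietly = TRUE)

if (have_pkgs) {
  library(neighbr)
  library(pmml)

  data(iris)
  iris$ID <- seq_len(nrow(iris))
  train_set <- iris[1:140, ]
  test_set <- iris[141:150, -c(4, 5, 6)]

  # Same model as in the first example of this page.
  fit <- knn(
    train_set = train_set, test_set = test_set,
    k = 3,
    categorical_target = "Species",
    continuous_target = "Petal.Width",
    comparison_measure = "squared_euclidean",
    return_ranked_neighbors = 3,
    id = "ID"
  )
  fit_pmml <- pmml(fit, model_name = "iris_knn")

  # Assumption: fit_pmml is an XML node, so the XML package helpers work on it.
  print(XML::xmlName(fit_pmml))

  # Write the PMML document to a temporary file (illustrative path).
  out_file <- tempfile(fileext = ".pmml")
  XML::saveXML(fit_pmml, file = out_file)
}
```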
See Also
Examples
## Not run:
# Continuous features with continuous target, categorical target,
# and neighbor ranking:
library(neighbr)
library(pmml) # provides the pmml() generic and the neighbr method

data(iris)

# Add an ID column to the data for neighbor ranking:
iris$ID <- seq_len(nrow(iris))

# Train set contains all predicted variables, features, and ID column:
train_set <- iris[1:140, ]

# Omit predicted variables (Petal.Width, Species) and ID column from test set:
test_set <- iris[141:150, -c(4, 5, 6)]

fit <- knn(
  train_set = train_set, test_set = test_set,
  k = 3,
  categorical_target = "Species",
  continuous_target = "Petal.Width",
  comparison_measure = "squared_euclidean",
  return_ranked_neighbors = 3,
  id = "ID"
)
fit_pmml <- pmml(fit)

# Logical features with categorical target and neighbor ranking:
library(neighbr)
library(pmml) # provides pmml() and the houseVotes84 data set

data("houseVotes84")

# Remove any rows with NA elements:
dat <- houseVotes84[complete.cases(houseVotes84), ]

# Change all {y, n} factor levels to {1, 0}:
feature_names <- names(dat)[!names(dat) %in% c("Class", "ID")]
for (n in feature_names) {
  levels(dat[, n])[levels(dat[, n]) == "n"] <- 0
  levels(dat[, n])[levels(dat[, n]) == "y"] <- 1
}

# Change factors to numeric (map each level label, not the internal code):
for (n in feature_names) {
  dat[, n] <- as.numeric(levels(dat[, n]))[dat[, n]]
}

# Add an ID column for neighbor ranking:
dat$ID <- seq_len(nrow(dat))

# Train set contains features, predicted variable, and ID:
train_set <- dat[1:225, ]

# Test set contains features only:
test_set <- dat[226:232, !names(dat) %in% c("Class", "ID")]

fit <- knn(
  train_set = train_set, test_set = test_set,
  k = 5,
  categorical_target = "Class",
  comparison_measure = "jaccard",
  return_ranked_neighbors = 3,
  id = "ID"
)
fit_pmml <- pmml(fit)
## End(Not run)
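
The factor-to-numeric conversion used in the second example can be tried in isolation. The sketch below applies the same two steps to a small made-up factor (votes is a hypothetical toy vector, not part of houseVotes84) and contrasts the result with a naive as.numeric() call, which returns the internal integer codes instead of the relabeled values.

```r
# Toy factor standing in for one vote column.
votes <- factor(c("y", "n", "y", "n", "n"))

# Step 1: relabel the levels, as in the example ("n" -> 0, "y" -> 1).
levels(votes)[levels(votes) == "n"] <- 0
levels(votes)[levels(votes) == "y"] <- 1

# Step 2: convert the level labels to numbers, then index them by the factor.
votes_numeric <- as.numeric(levels(votes))[votes]

# For comparison: as.numeric() on a factor yields the integer codes.
votes_codes <- as.numeric(votes)

print(votes_numeric)
print(votes_codes)
```

Indexing the converted labels by the factor is what keeps the values at 0 and 1; the internal codes would be 1 and 2, which would give the features a different numeric encoding than the one intended.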