R: Generate PMML for a neighbr object from the *neighbr*...

pmml.neighbr {pmml}

R Documentation

Generate PMML for a neighbr object from the neighbr package.

Description

Generate PMML for a neighbr object from the neighbr package.

Usage

## S3 method for class 'neighbr'
pmml(
  model,
  model_name = "kNN_model",
  app_name = "SoftwareAG PMML Generator",
  description = "K Nearest Neighbors Model",
  copyright = NULL,
  model_version = NULL,
  transforms = NULL,
  missing_value_replacement = NULL,
  ...
)

Arguments

`model`	A neighbr object.
`model_name`	A name to be given to the PMML model.
`app_name`	The name of the application that generated the PMML.
`description`	A descriptive text for the Header element of the PMML.
`copyright`	The copyright notice for the model.
`model_version`	A string specifying the model version.
`transforms`	Data transformations. Not supported by this converter; must be left as NULL.
`missing_value_replacement`	Value to be used as the 'missingValueReplacement' attribute for all MiningFields.
`...`	Further arguments passed to or from other methods.

Details

The model is represented in the PMML NearestNeighborModel format.

The current version of this converter does not support transformations, so transforms must be left as NULL. It sets categoricalScoringMethod to "majorityVote", continuousScoringMethod to "average", and isTransformed to "false".

Value

PMML representation of the neighbr object.

Examples

## Not run: 

# Continuous features with continuous target, categorical target,
# and neighbor ranking:

library(neighbr)
data(iris)

# Add an ID column to the data for neighbor ranking:
iris$ID <- seq_len(nrow(iris))

# Train set contains all predicted variables, features, and ID column:
train_set <- iris[1:140, ]

# Omit predicted variables and ID column from test set:
test_set <- iris[141:150, -c(4, 5, 6)]

fit <- knn(
  train_set = train_set, test_set = test_set,
  k = 3,
  categorical_target = "Species",
  continuous_target = "Petal.Width",
  comparison_measure = "squared_euclidean",
  return_ranked_neighbors = 3,
  id = "ID"
)

fit_pmml <- pmml(fit)


# Logical features with categorical target and neighbor ranking:

library(neighbr)
data("houseVotes84")

# Remove any rows with NA elements:
dat <- houseVotes84[complete.cases(houseVotes84), ]

# Recode the factor levels of all features: "n" becomes 0, "y" becomes 1:
feature_names <- names(dat)[!names(dat) %in% c("Class", "ID")]
for (n in feature_names) {
  levels(dat[, n])[levels(dat[, n]) == "n"] <- 0
  levels(dat[, n])[levels(dat[, n]) == "y"] <- 1
}

# Change factors to numeric:
for (n in feature_names) {
  dat[, n] <- as.numeric(levels(dat[, n]))[dat[, n]]
}

# Add an ID column for neighbor ranking:
dat$ID <- seq_len(nrow(dat))

# Train set contains features, predicted variable, and ID:
train_set <- dat[1:225, ]

# Test set contains features only:
test_set <- dat[226:232, !names(dat) %in% c("Class", "ID")]

fit <- knn(
  train_set = train_set, test_set = test_set,
  k = 5,
  categorical_target = "Class",
  comparison_measure = "jaccard",
  return_ranked_neighbors = 3,
  id = "ID"
)

fit_pmml <- pmml(fit)
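

# Inspecting and saving the generated PMML:
#
# A minimal sketch, not part of the original examples. It assumes that
# pmml() returns an XML node from the XML package (so children can be
# accessed by element name) and that save_pmml() is available in the
# installed pmml version. The names iris_id, iris_fit, iris_pmml,
# model_node, and pmml_file are illustrative only.

library(pmml)
library(neighbr)

# Work on a copy of iris with an ID column for neighbor ranking:
iris_id <- datasets::iris
iris_id$ID <- seq_len(nrow(iris_id))

iris_fit <- knn(
  train_set = iris_id[1:140, ],
  test_set = iris_id[141:150, -c(4, 5, 6)],
  k = 3,
  categorical_target = "Species",
  continuous_target = "Petal.Width",
  comparison_measure = "squared_euclidean",
  return_ranked_neighbors = 3,
  id = "ID"
)

iris_pmml <- pmml(iris_fit)

# Look up the model element and one of the attributes described in Details
# (Details states that categoricalScoringMethod is set to "majorityVote"):
model_node <- iris_pmml[["NearestNeighborModel"]]
XML::xmlName(model_node)
XML::xmlGetAttr(model_node, "categoricalScoringMethod")

# Write the PMML to a temporary file:
pmml_file <- tempfile(fileext = ".xml")
save_pmml(iris_pmml, pmml_file)
file.exists(pmml_file)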

## End(Not run)

[Package pmml version 2.5.2 Index]