NormDiscreteXform {pmmlTransformations}    R Documentation

Normalize discrete values in accordance with the PMML element NormDiscrete

Description

Define a new derived variable for each possible value of a categorical variable. Given a categorical variable catVar with possible discrete values A and B, this creates two derived variables, catVar_A and catVar_B. If, for example, the input value of catVar is A, then catVar_A equals 1 and catVar_B equals 0.
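The encoding described above can be sketched in base R, without the package. This is only an illustration of the 0/1 indicator idea, not the package's own implementation; the toy vector catVar and the data frame derived are hypothetical names introduced for the sketch.

```r
# Hypothetical toy data: a categorical variable with values A and B.
catVar <- factor(c("A", "B", "A"))

# Build one 0/1 indicator column per level of the variable.
derived <- sapply(levels(catVar), function(v) as.integer(catVar == v))

# Name the columns <variable>_<value>, following the convention above.
colnames(derived) <- paste0("catVar_", levels(catVar))
derived <- as.data.frame(derived)

print(derived)
```

Each row has a 1 in exactly one of the derived columns, the one matching that row's value of catVar.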

Usage

NormDiscreteXform(boxdata, xformInfo=NA, 
                  inputVar=NA, mapMissingTo=NA, ...)

Arguments

boxdata

the wrapper object obtained by using the WrapData function on the raw data.

xformInfo

specification of details of the transformation: the name of the input variable to be transformed.

inputVar

the name of the input variable in the data to which the transformation is to be applied.

mapMissingTo

value to be given to the transformed variable if the value of the input variable is missing.

...

further arguments passed to or from other methods.

Details

Given an input variable, inputVar, and a value missingVal to assign to the transformed variable when the input value is missing, the NormDiscreteXform command including all optional parameters has the format:

xformInfo="inputVar=input_variable, mapMissingTo=missingVal"

There are two ways to refer to the input variable. The first is to use its column number, that is, the position at which the variable appears in the data attribute of the boxdata object. This is indicated in the format "column#". The second is to refer to the variable by its name.

The xformInfo and inputVar parameters provide the same information. Either one may be used, but at least one of them is required. If both parameters are given, the inputVar parameter is used as the default.
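As a minimal sketch of this equivalence, the two calls below use the iris data from the Examples section and compare the resulting column names. It assumes the package is installed (the code is skipped otherwise) and that the transformed columns are held in the data element of the returned object; sameColumns is a hypothetical helper variable.

```r
data(iris)

# Stays NA if the package is not available and the check is skipped.
sameColumns <- NA

if (requireNamespace("pmmlTransformations", quietly = TRUE)) {
  library(pmmlTransformations)

  # Same transformation requested through each of the two parameters.
  byInputVar  <- NormDiscreteXform(WrapData(iris), inputVar = "Species")
  byXformInfo <- NormDiscreteXform(WrapData(iris), xformInfo = "Species")

  # Compare the column names of the two results.
  sameColumns <- identical(names(byInputVar$data), names(byXformInfo$data))
  print(sameColumns)
}
```

If the two parameters are interchangeable as described, the comparison should report that both calls produce the same set of columns.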

The output of this transformation is a set of transformed variables, one for each possible value of the input variable. For example, given possible values of the input variable val1, val2, ... these transformed variables are by default named InputVar_val1, InputVar_val2, ...
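The default naming rule can be illustrated for the Species variable of the iris data. The sketch below builds the expected names in base R and, only if the package is installed, checks that they appear among the columns of the transformed data. Reading the columns from the data element of the wrapper object is an assumption, and expectedNames is a hypothetical variable name.

```r
data(iris)

# Expected default names, built from the rule InputVar_value.
expectedNames <- paste("Species", levels(iris$Species), sep = "_")
print(expectedNames)

# Optional check against the package, if it is installed. The `data`
# element is assumed to hold the transformed columns.
if (requireNamespace("pmmlTransformations", quietly = TRUE)) {
  library(pmmlTransformations)
  irisBox <- NormDiscreteXform(WrapData(iris), xformInfo = "Species")
  print(expectedNames %in% names(irisBox$data))
}
```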

Value

R object containing the raw data, the transformed data and data statistics.

Author(s)

Tridivesh Jena, Zementis, Inc.

See Also

WrapData

Examples

# Load the package and the standard iris dataset, already available in R
   library(pmmlTransformations)
   data(iris)

# First wrap the data
   irisBox <- WrapData(iris)

# Discretize the "Species" variable. This will find all possible 
# values of the "Species" variable and define the new variables:
#
#   "Species_setosa" such that it is 1 if 
#      "Species" equals "setosa", else 0;
#   "Species_versicolor" such that it is 1 if 
#      "Species" equals "versicolor", else 0;
#   "Species_virginica" such that it is 1 if 
#      "Species" equals "virginica", else 0
#
# The inputVar parameter used here should be replaced by the preferred 
# xformInfo parameter, as shown in the next example.

  irisBox <- NormDiscreteXform(irisBox,inputVar="Species")
  
# Exact same operation performed with a different parameter name. 
# Use of this new parameter is the preferred method as the previous 
# parameter will be deprecated soon.

  irisBox <- WrapData(iris)
  irisBox <- NormDiscreteXform(irisBox,xformInfo="Species")
  

[Package pmmlTransformations version 1.3.3 Index]