newdata {semiArtificial} | R Documentation |
Generate semi-artificial data using a generator
Description
Using a generator built with rbfDataGen or treeEnsemble, the method generates size new instances.
Usage
## S3 method for class 'RBFgenerator'
newdata(object, size, var=c("estimated","Silverman"),
classProb=NULL, defaultSpread=0.05, ... )
## S3 method for class 'TreeEnsemble'
newdata(object, fillData=NULL,
size=ifelse(is.null(fillData),1,nrow(fillData)),
onlyPath=FALSE, classProb=NULL,
predictClass=FALSE, ...)
Arguments
object |
An object of class |
fillData |
A data frame with some of the values already specified. All missing values (i.e., NA values) are filled in by the generator. |
size |
The number of instances to generate. By default this is one instance or, if fillData is supplied, the number of rows in that data frame. |
var |
For the generator of type |
classProb |
For classification problems, a vector of desired class value probability distribution. Default value |
defaultSpread |
For the generator of type |
onlyPath |
For the generator of type |
predictClass |
For classification problems and the generator of type |
... |
Additional parameters passed to density estimation functions kde, logspline, and quantile. |
Details
The function uses the object structure as returned by rbfDataGen or treeEnsemble.
In the case of RBFgenerator, the object contains descriptions of the Gaussian kernels that model the original data.
The kernels are used to generate the required number of new instances.
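The idea of drawing instances from a set of Gaussian kernels can be illustrated with a minimal base R sketch. It is not the package's implementation: the function sampleFromKernels, the matrices centers and spreads, and the vector weights are hypothetical names with made-up illustrative values, and the sketch assumes independent normal noise in each dimension.

```r
# Hypothetical kernel description (illustrative values, not taken from the package):
# one row per kernel, one column per numeric attribute.
set.seed(1)
centers <- rbind(c(5.0, 3.4), c(5.9, 2.8), c(6.6, 3.0))
spreads <- rbind(c(0.35, 0.38), c(0.52, 0.31), c(0.64, 0.32))
weights <- c(1, 1, 1) / 3   # probability of choosing each kernel

sampleFromKernels <- function(size, centers, spreads, weights) {
  # choose a kernel for every new instance
  k <- sample(nrow(centers), size, replace = TRUE, prob = weights)
  d <- ncol(centers)
  # standard normal noise, scaled by the chosen kernel's spread in each dimension
  noise <- matrix(rnorm(size * d), nrow = size, ncol = d)
  out <- centers[k, , drop = FALSE] + noise * spreads[k, , drop = FALSE]
  as.data.frame(out)
}

newInstances <- sampleFromKernels(150, centers, spreads, weights)
summary(newInstances)
```

Wider spreads give more scattered instances around each center, which is why the choice of kernel width matters for the generated data.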
The width of the provided kernels can be set in two ways. With var="estimated", the width is the estimated spread of the training instances that have the maximal activation value for the particular kernel.
With var="Silverman", the width is set by the generalization of Silverman's rule of thumb to the multivariate case (unreliable for larger dimensions).
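For orientation, a commonly quoted multivariate form of Silverman's rule of thumb with a diagonal bandwidth can be sketched in base R as follows. Whether the package computes the width in exactly this way is an assumption; silvermanBandwidth is a hypothetical helper name, not a function of semiArtificial.

```r
# Silverman's rule of thumb, diagonal multivariate form (textbook version):
#   h_j = (4 / (d + 2))^(1 / (d + 4)) * n^(-1 / (d + 4)) * sd_j
# where n is the number of instances, d the number of dimensions,
# and sd_j the standard deviation of attribute j.
silvermanBandwidth <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- ncol(x)
  sds <- apply(x, 2, sd)
  (4 / (d + 2))^(1 / (d + 4)) * n^(-1 / (d + 4)) * sds
}

# bandwidths for the four numeric attributes of iris
h <- silvermanBandwidth(iris[, 1:4])
print(h)
```

In this form the exponent -1/(d+4) approaches zero as d grows, so the bandwidth shrinks only slowly with additional instances, which is one way to see why the rule becomes less dependable in higher dimensions.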
In the case of a TreeEnsemble generator, no additional parameters are needed except the number of generated instances.
Value
The method returns a data.frame object with the required number of instances.
Author(s)
Marko Robnik-Sikonja
See Also
Examples
# load the package that provides rbfDataGen, treeEnsemble, and newdata
library(semiArtificial)
# inspect properties of the iris data set
plot(iris, col=iris$Species)
summary(iris)
# create RBF generator
irisRBF <- rbfDataGen(Species~.,iris)
# create tree ensemble generator
irisEnsemble <- treeEnsemble(Species~.,iris,noTrees=10)
# create new data with both generators
irisNewRBF <- newdata(irisRBF, size=150)
irisNewEns <- newdata(irisEnsemble, size=150)
# inspect properties of the new data
plot(irisNewRBF, col = irisNewRBF$Species) #plot generated data
summary(irisNewRBF)
plot(irisNewEns, col = irisNewEns$Species) #plot generated data
summary(irisNewEns)