classic2sym {ggESDA}    R Documentation

Convert a classical data frame into symbolic data

Description

A function for converting classical data, given as a data frame or a matrix with one value per entry, into symbolic data, in which each entry is an interval or a set. The result is an object of class ggESDA containing the interval data, the raw data (if it exists), and typical statistics.
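
The idea of collapsing single values into intervals can be sketched in base R. This is only a conceptual illustration and not the ggESDA implementation: the helper `to_intervals` is a hypothetical name, the `[min : max]` text format is chosen for display, and running k-means directly on the unscaled numeric columns is an assumption made for the example.

```r
# Conceptual sketch only; classic2sym() may work differently internally.
# to_intervals() is a hypothetical helper, not part of ggESDA.
to_intervals <- function(df, groups) {
  # Per-group minimum and maximum of every column
  lo <- aggregate(df, by = list(group = groups), FUN = min)
  hi <- aggregate(df, by = list(group = groups), FUN = max)
  out <- data.frame(group = lo$group)
  for (v in names(df)) {
    # One interval per group and variable, shown as text
    out[[v]] <- sprintf("[%.1f : %.1f]", lo[[v]], hi[[v]])
  }
  out
}

num <- iris[sapply(iris, is.numeric)]

# Grouping by an existing factor variable
by_species <- to_intervals(num, iris$Species)
print(by_species)

# Grouping by a clustering result (assumption: plain k-means on raw columns)
set.seed(1)
clusters <- kmeans(num, centers = 5)$cluster
by_cluster <- to_intervals(num, clusters)
print(by_cluster)
```

Each row of the result stands for one group (a "concept"), and each cell holds the range of the observations that fell into that group.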

Usage

classic2sym(data = NULL, groupby = "kmeans", k = 5, minData = NULL,
            maxData = NULL, modalData = NULL)

Arguments

data

A classical data frame to be converted into interval data.

groupby

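The aggregation method. It can be either a clustering method (for example "kmeans" or "hclust") or the name of a variable that exists in the input data (the variable must be a factor). Use "customize" to supply the interval bounds directly through minData and maxData. Default "kmeans".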

k

The number of groups, used by the clustering method. Default k = 5.

minData

The data holding the lower bounds (minima) of the intervals, for example selected columns of the input data. Must be defined by the user when groupby = "customize".

maxData

The data holding the upper bounds (maxima) of the intervals, for example selected columns of the input data. Must be defined by the user when groupby = "customize".

modalData

A list in which each element contains the column indices of one modal multi-valued variable in the input data. The values in these columns are proportions, and within each modal variable the values in every row must sum to 1, e.g. 0, 1, 0 or 0.2, 0.3, 0.5. For example, modalData[[1]] = c(2, 3) and modalData[[2]] = c(7:10) mean that columns 2, 3, 7, 8, 9 and 10 hold modal data. Note: this option is only valid when groupby == "customize".
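
The requirements on minData, maxData and modalData can be checked on the user side before calling the function. The following is a minimal base R sketch under stated assumptions: the column names `lower` and `upper`, the tolerance `1e-8`, and the checks themselves are illustrative and are not taken from ggESDA, which may or may not perform similar validation.

```r
# Illustrative pre-check for groupby = "customize" inputs; not part of ggESDA.
set.seed(42)
a1 <- round(runif(10, 0, 0.4), 1)
a2 <- round(runif(10, 0, 0.4), 1)
d <- data.frame(a1 = a1, a2 = a2, a3 = 1 - (a1 + a2),   # one modal variable
                lower = round(runif(10, -50, -10)),     # interval minima
                upper = round(runif(10, 10, 20)))       # interval maxima

# Columns 1 to 3 together form a single modal variable
modalData <- list(1:3)

# Every modal block must hold proportions whose rows sum to 1.
# A small tolerance (an assumption) absorbs floating-point error.
modal_ok <- vapply(modalData, function(cols) {
  block <- as.matrix(d[, cols])
  all(block >= 0 & block <= 1) && all(abs(rowSums(block) - 1) < 1e-8)
}, logical(1))

# Every lower bound must not exceed its upper bound
bounds_ok <- all(d$lower <= d$upper)

print(modal_ok)
print(bounds_ok)
```

With inputs that pass both checks, `d$lower`, `d$upper` and `modalData` could then be passed as minData, maxData and modalData respectively.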

Value

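classic2sym returns an object of class "ggESDA", which contains the interval data together with further components such as the raw data, the cluster result, and the statistics (see the end of Examples for how to extract them).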

Examples

#classical data to symbolic data
classic2sym(iris)
classic2sym(mtcars, groupby = "kmeans", k = 10)
classic2sym(iris, groupby = "hclust", k = 7)
classic2sym(iris, groupby = "Species")

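#user-defined intervals: simulate lower (x) and upper (y) bounds
x1 <- runif(10, -30, -10)
y1 <- runif(10, -10, 30)
x2 <- runif(10, -5, 5)
y2 <- runif(10, 10, 50)
x3 <- runif(10, -50, 30)
y3 <- runif(10, 31, 60)

d <- data.frame(min1 = x1, max1 = y1, min2 = x2, max2 = y2, min3 = x3, max3 = y3)
#three interval variables built from the min and max columns
classic2sym(d, groupby = "customize", minData = d[, c(1, 3, 5)], maxData = d[, c(2, 4, 6)])
#a single interval variable from two vectors (here min1 < min2 in every row)
classic2sym(d, groupby = "customize", minData = d$min1, maxData = d$min2)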


#example of building modal data
library(magrittr)  #provides the %>% pipe used below
#for the first modal data proportion
a1 <- runif(10, 0,0.4) %>% round(digits = 1)
a2 <- runif(10, 0,0.4) %>% round(digits = 1)

#for the second modal data proportion
b1 <- runif(10, 0,0.4) %>% round(digits = 1)
b2 <- runif(10, 0,0.4) %>% round(digits = 1)

#for interval-valued data
c1 <- runif(10, 10, 20) %>% round(digits = 0)
c2 <- runif(10, -50, -10) %>% round(digits = 0)

#build simulated data
d <- data.frame(a1 = a1, a2 = a2, a3 = 1-(a1+a2),
                c1 = c1, c2 = c2,
                b1 = b1, b2 = b2, b3 = 1-(b1+b2))

#transformation
classic2sym(d, groupby = "customize",
            minData = d$c2,
            maxData = d$c1,
            modalData = list(1:3, 6:8)) #two modal variables (columns 1-3 and 6-8)

#extract the data
symObj <- classic2sym(iris)
symObj$intervalData       #interval data
symObj$rawData            #raw data
symObj$clusterResult      #cluster result
symObj$statisticsDF       #statistics

[Package ggESDA version 0.2.0 Index]