classic2sym {ggESDA} | R Documentation |
Convert a classical data frame into symbolic data.
Description
A function for converting classical data, given as a data frame or a matrix with one value per entry, into symbolic data, in which each entry is an interval or a set. The converted object is of class ggESDA and contains the interval data, the raw data (if it exists), and typical statistics.
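The idea of turning one-value-per-entry data into interval data can be sketched in base R. This is only an illustration of the concept, not ggESDA's internal code: it assumes grouping by a factor column and uses the group minimum and maximum as interval bounds. The names `num_cols`, `lower`, `upper` and `intervals` are made up for this sketch.

```r
# Sketch only: base R aggregation, not ggESDA's implementation.
# Collapse each numeric column of iris to [min, max] within each Species.
num_cols <- names(iris)[sapply(iris, is.numeric)]
lower <- aggregate(iris[num_cols], by = list(group = iris$Species), FUN = min)
upper <- aggregate(iris[num_cols], by = list(group = iris$Species), FUN = max)

# One row per group; each cell is formatted as "[min : max]".
intervals <- data.frame(group = lower$group)
for (col in num_cols) {
  intervals[[col]] <- sprintf("[%.1f : %.1f]", lower[[col]], upper[[col]])
}
print(intervals)
```

Each of the 150 classical rows contributes to exactly one of the three symbolic rows, so the result has one interval per group and variable.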
Usage
classic2sym(data=NULL,groupby = "kmeans",k=5,minData=NULL,maxData=NULL,
modalData = NULL)
Arguments
data |
A classical data frame to be converted into interval data. |
groupby |
The aggregation method. It can be either a clustering method (such as "kmeans" or "hclust") or the name of a variable in the input data, which must be a factor. Default "kmeans". |
k |
The number of groups, used when groupby is a clustering method. Default k = 5. |
minData |
If groupby is "customize", the user must specify which data holds the interval minimum values (minData) and which holds the maximum values (maxData). This argument supplies the minimum values. |
maxData |
If groupby is "customize", the user must specify which data holds the interval minimum values (minData) and which holds the maximum values (maxData). This argument supplies the maximum values. |
modalData |
A list in which each element is a vector of column indices of the input data that together form one modal multi-valued variable. The values in these columns are proportions, and within each such set of columns every row must sum to 1, e.g. 0, 1, 0 or 0.2, 0.3, 0.5. For example, modalData[[1]] = c(2, 3) and modalData[[2]] = c(7:10) declare columns 2, 3, 7, 8, 9 and 10 as modal. Note: this option is only valid when groupby == "customize". |
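The constraints on the "customize" inputs can be checked before calling the function. The helper below is hypothetical and not part of ggESDA. It tests the documented rule that each set of modal columns sums to 1 in every row, and it additionally assumes (this is not stated in the documentation) that every minimum should not exceed its matching maximum. A small tolerance `tol` is used because the proportions are floating-point numbers.

```r
# Hypothetical helper, not part of ggESDA: sanity checks for groupby = "customize".
check_customize_input <- function(data, minData, maxData,
                                  modalData = NULL, tol = 1e-8) {
  minData <- as.data.frame(minData)
  maxData <- as.data.frame(maxData)
  # Assumption: bounds come in matching shapes and min <= max everywhere.
  stopifnot(identical(dim(minData), dim(maxData)))
  stopifnot(all(as.matrix(minData) <= as.matrix(maxData)))
  # Documented rule: each modal variable's columns sum to 1 in every row.
  for (cols in modalData) {
    row_totals <- rowSums(data[, cols, drop = FALSE])
    stopifnot(all(abs(row_totals - 1) < tol))
  }
  invisible(TRUE)
}

set.seed(1)
a1 <- round(runif(10, 0, 0.4), 1)
a2 <- round(runif(10, 0, 0.4), 1)
d <- data.frame(a1 = a1, a2 = a2, a3 = 1 - (a1 + a2),
                lo = runif(10, -50, -10), hi = runif(10, 10, 20))
check_customize_input(d, minData = d$lo, maxData = d$hi,
                      modalData = list(1:3))
```

If a check fails, `stopifnot` raises an error naming the violated condition, which is usually easier to diagnose than an error raised further downstream.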
Value
classic2sym returns an object of class "ggESDA", which contains the interval data and the following components.
intervalData - The interval data after conversion, also known as an RSDA object.
rawData - The classical data supplied by the user.
clusterResult - The clustering result. It exists only if the groupby method is a clustering method.
statisticsDF - A list of data frames containing typical statistics for each group.
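How a clustering result, interval bounds and per-group statistics relate to one another can be sketched with base R's `kmeans`. This is an illustrative analogue under stated assumptions, not what classic2sym does internally: the choice of columns, the use of `scale()` before clustering, and the use of the mean as the statistic are all choices made for this sketch.

```r
# Sketch only: a base R analogue of the clustering path, not ggESDA's code.
set.seed(42)                            # kmeans starts from random centers
num <- mtcars[, c("mpg", "hp", "wt")]
km <- kmeans(scale(num), centers = 5)   # cluster labels are in km$cluster

# Interval bounds per cluster, plus one simple per-cluster statistic.
lower <- aggregate(num, by = list(cluster = km$cluster), FUN = min)
upper <- aggregate(num, by = list(cluster = km$cluster), FUN = max)
means <- aggregate(num, by = list(cluster = km$cluster), FUN = mean)
print(cbind(lower, upper[names(num)]))
```

Here `km` plays the role of a cluster result, `lower` and `upper` together describe the intervals, and `means` stands in for a per-group statistics table.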
Examples
library(ggESDA)
# classical data to symbolic data
classic2sym(iris)
classic2sym(mtcars, groupby = "kmeans", k = 10)
classic2sym(iris, groupby = "hclust", k = 7)
classic2sym(iris, groupby = "Species")
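# simulate three pairs of lower/upper bounds
x1 <- runif(10, -30, -10)
y1 <- runif(10, -10, 30)
x2 <- runif(10, -5, 5)
y2 <- runif(10, 10, 50)
x3 <- runif(10, -50, 30)
y3 <- runif(10, 31, 60)
d <- data.frame(min1 = x1, max1 = y1, min2 = x2, max2 = y2, min3 = x3, max3 = y3)
# three interval variables, bounds given as data frames
classic2sym(d, groupby = "customize", minData = d[, c(1, 3, 5)], maxData = d[, c(2, 4, 6)])
# one interval variable, bounds given as vectors
classic2sym(d, groupby = "customize", minData = d$min1, maxData = d$min2)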
# example of building modal data
# proportions for the first modal variable
a1 <- round(runif(10, 0, 0.4), digits = 1)
a2 <- round(runif(10, 0, 0.4), digits = 1)
# proportions for the second modal variable
b1 <- round(runif(10, 0, 0.4), digits = 1)
b2 <- round(runif(10, 0, 0.4), digits = 1)
# interval-valued data
c1 <- round(runif(10, 10, 20), digits = 0)
c2 <- round(runif(10, -50, -10), digits = 0)
# build simulated data
d <- data.frame(a1 = a1, a2 = a2, a3 = 1 - (a1 + a2),
                c1 = c1, c2 = c2,
                b1 = b1, b2 = b2, b3 = 1 - (b1 + b2))
# transformation
classic2sym(d, groupby = "customize",
            minData = d$c2,
            maxData = d$c1,
            modalData = list(1:3, 6:8)) # two modal variables
# extract the data
symObj <- classic2sym(iris)
symObj$intervalData  # interval data
symObj$rawData       # raw data
symObj$clusterResult # cluster result
symObj$statisticsDF  # statistics