seqecmpgroup {TraMineR} | R Documentation |
Identifying discriminating subsequences
Description
Identify and sort the most discriminating subsequences by their discriminating power.
Usage
seqecmpgroup(subseq, group, method="chisq", pvalue.limit=NULL,
weighted = TRUE)
Arguments
subseq |
A list of subsequences, such as the one returned by seqefsub (see Examples) |
group |
Group membership, i.e., a variable or factor defining the groups which we want to discriminate |
method |
The discrimination method; one of "chisq" (the default) or "bonferroni" (see Details) |
pvalue.limit |
Can be used to filter the results. Only subsequences with a p-value lower than this parameter are selected. If NULL (the default), no filtering is applied |
weighted |
Logical. If TRUE (the default), the sequence weights are taken into account |
Details
The following discrimination test functions are implemented: chisq, the Pearson chi-squared test of independence, and bonferroni, the Pearson chi-squared test of independence with Bonferroni correction.
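As a rough illustration of what these two tests compute, the sketch below runs a Pearson chi-squared independence test on presence/absence counts of each subsequence by group, then applies a Bonferroni correction to the resulting p-values. It uses base R only. The counts, the subsequence labels, the group sizes and all variable names are made up for illustration, and the code is not meant to reproduce the internal implementation of seqecmpgroup.

```r
# Hypothetical group sizes and counts: number of sequences in each group
# that contain each subsequence (made-up numbers, not taken from actcal).
group_sizes <- c(man = 100, woman = 120)
present <- rbind(
  "(A)-(B)" = c(man = 60, woman = 30),
  "(A)-(C)" = c(man = 45, woman = 50),
  "(B)-(C)" = c(man = 20, woman = 40)
)

# Pearson chi-squared test of independence between group membership and
# presence of a subsequence, using one 2 x 2 table per subsequence.
# correct = FALSE gives the plain Pearson statistic (an assumption of this
# sketch; the help page does not say whether a correction is applied).
test_one <- function(counts) {
  tab <- rbind(present = counts, absent = group_sizes - counts)
  res <- chisq.test(tab, correct = FALSE)
  c(statistic = unname(res$statistic), p.value = res$p.value)
}
results <- as.data.frame(t(apply(present, 1, test_one)))

# Bonferroni correction: each p-value is multiplied by the number of
# tests performed and capped at 1.
results$p.bonferroni <- p.adjust(results$p.value, method = "bonferroni")

# Sort by decreasing test statistic, i.e., most discriminating first.
results <- results[order(results$statistic, decreasing = TRUE), ]
print(results)
```

With several hundred candidate subsequences, the correction matters: a raw p-value just under 0.05 will usually no longer pass the same threshold once multiplied by the number of tests.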
Value
An object of type subseqelistchisq (subtype of subseqelist) with the following elements:
subseq |
Sorted list of found discriminating subsequences |
eseq |
The event sequence object on which the tests were computed |
constraint |
Time constraints used for searching the subsequences (see |
labels |
Levels (value labels) of the target group variable |
type |
Type of test used |
data |
A data frame with columns support, index (original rank of the subsequence, i.e., its position in the inputted |
Author(s)
Matthias Studer (with Gilbert Ritschard for the help page)
References
Studer, M., Müller, N.S., Ritschard, G. & Gabadinho, A. (2010), "Classer, discriminer et visualiser des séquences d'événements", In Extraction et gestion des connaissances (EGC 2010), Revue des nouvelles technologies de l'information RNTI. Vol. E-19, pp. 37-48.
Ritschard, G., Bürgin, R., and Studer, M. (2014), "Exploratory Mining of Life Event Histories", In McArdle, J.J. & Ritschard, G. (eds) Contemporary Issues in Exploratory Data Mining in the Behavioral Sciences. Series: Quantitative Methodology, pp. 221-253. New York: Routledge.
See Also
plot.subseqelistchisq to plot the results.
Examples
library(TraMineR)
data(actcal.tse)
actcal.eseq <- seqecreate(actcal.tse)
## Search for frequent subsequences, i.e., those with a support of at least 1%
## (pmin.support=0.01), which here means appearing at least 20 times
fsubseq <- seqefsub(actcal.eseq, pmin.support=0.01)
## Search for the subsequences that best discriminate between men and women
data(actcal)
discr <- seqecmpgroup(fsubseq, group=actcal$sex, method="bonferroni")
## Print the six most discriminating subsequences
print(discr[1:6])
## Plot the six most discriminating subsequences
plot(discr[1:6])