seqecmpgroup {TraMineR}

R Documentation

Identifying discriminating subsequences

Description

Identify and sort the most discriminating subsequences by their discriminating power.

Usage

seqecmpgroup(subseq, group, method="chisq", pvalue.limit=NULL,
             weighted = TRUE)

Arguments

`subseq`	A `subseqelist` object (list of subsequences) such as produced by `seqefsub`
`group`	Group membership, i.e., a variable or factor defining the groups which we want to discriminate
`method`	The discrimination method; one of `"bonferroni"` or `"chisq"`
`pvalue.limit`	Can be used to filter the results. Only subsequences with a p-value lower than this parameter are selected. If `NULL` all subsequences are returned (regardless of their p-values).
`weighted`	Logical. If `TRUE`, `seqecmpgroup` uses the weights specified in `subseq` (see `seqefsub`).

Details

The following discrimination tests are implemented: `chisq`, the Pearson chi-squared test of independence, and `bonferroni`, the same test with a Bonferroni correction for multiple testing.
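The idea behind the two methods can be sketched in base R. The following is an illustration only, not the code TraMineR uses internally: the toy presence indicators (`subseq_A`, `subseq_B`, `subseq_C`) and the group sizes are hypothetical, and the absence of a continuity correction is an assumption made for the sketch.

```r
# Hypothetical toy data: 50 men and 50 women, and for each of three
# made-up subsequences a flag telling whether the sequence contains it.
group <- factor(rep(c("man", "woman"), each = 50))
present <- data.frame(
  subseq_A = c(rep(TRUE, 35), rep(FALSE, 15), rep(TRUE, 10), rep(FALSE, 40)),
  subseq_B = c(rep(TRUE, 25), rep(FALSE, 25), rep(TRUE, 24), rep(FALSE, 26)),
  subseq_C = c(rep(TRUE, 20), rep(FALSE, 30), rep(TRUE, 32), rep(FALSE, 18))
)

# "chisq" idea: Pearson chi-squared test of independence between
# presence of the subsequence and group membership (one test per subsequence).
# correct = FALSE is an assumption of this sketch.
p_raw <- sapply(present, function(x) {
  chisq.test(table(x, group), correct = FALSE)$p.value
})

# "bonferroni" idea: the same p-values adjusted for the number of tests.
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Sort from most to least discriminating (smallest p-value first).
result <- data.frame(p_raw = p_raw, p_bonf = p_bonf,
                     row.names = names(p_raw))
result <- result[order(result$p_raw), ]
print(result)
```

In this toy data `subseq_A` ranks first because its presence differs most strongly between the two groups. The Bonferroni-adjusted p-values are never smaller than the raw ones, so a given `pvalue.limit` style threshold retains fewer subsequences after correction.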

Value

An object of class `subseqelistchisq` (a subclass of `subseqelist`) with the following elements:

`subseq`	Sorted list of found discriminating subsequences
`eseq`	The event sequence object on which the tests were computed
`constraint`	Time constraints used for searching the subsequences (see `seqeconstraint`)
`labels`	Levels (value labels) of the target group variable
`type`	Type of test used
`data`	A data frame with columns support, index (original rank of the subsequence, i.e., its position in the inputted `subseq`) and a pair of frequency and Pearson residual columns for each group
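As a brief sketch of how these elements can be inspected, the snippet below builds a result from the `actcal.tse` and `actcal` data sets shipped with TraMineR and accesses the elements by name. It assumes the returned object is a list whose elements are reachable with `$`, as the element list above suggests; the exact column names of `data` are not assumed here.

```r
library(TraMineR)

# Build event sequences and extract frequent subsequences
data(actcal.tse)
data(actcal)
actcal.eseq <- seqecreate(actcal.tse)
fsubseq <- seqefsub(actcal.eseq, pmin.support = 0.01)

# Compare men and women with the plain chi-squared method
discr <- seqecmpgroup(fsubseq, group = actcal$sex, method = "chisq")

# Inspect the returned elements
class(discr)        # class vector of the result
discr$type          # type of test used
discr$labels        # levels of the group variable
head(discr$data)    # per-subsequence statistics, one row per subsequence
```

Looking at `head(discr$data)` before filtering is a convenient way to see which frequency and residual columns were created for each group.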

Author(s)

Matthias Studer (with Gilbert Ritschard for the help page)

References

Studer, M., Müller, N.S., Ritschard, G. & Gabadinho, A. (2010), "Classer, discriminer et visualiser des séquences d'événements", In Extraction et gestion des connaissances (EGC 2010), Revue des nouvelles technologies de l'information RNTI. Vol. E-19, pp. 37-48.

Ritschard, G., Bürgin, R. & Studer, M. (2014), "Exploratory Mining of Life Event Histories", In McArdle, J.J. & Ritschard, G. (eds) Contemporary Issues in Exploratory Data Mining in the Behavioral Sciences. Series: Quantitative Methodology, pp. 221-253. New York: Routledge.

Examples

data(actcal.tse)
actcal.eseq <- seqecreate(actcal.tse)

## Searching for frequent subsequences, that is, those with a support
## of at least 1% (pmin.support=0.01)
fsubseq <- seqefsub(actcal.eseq, pmin.support=0.01)

## Searching for the subsequences that best discriminate between men and women
data(actcal)
discr <- seqecmpgroup(fsubseq, group=actcal$sex, method="bonferroni")
## Printing the six most discriminating subsequences
print(discr[1:6])
## Plotting the six most discriminating subsequences
plot(discr[1:6])

[Package TraMineR version 2.2-10 Index]