getConsensusClusters {clickstream} | R Documentation |
Generates an optimal set of clusters for a clickstream object based on consensus clustering.
Description
This experimental function implements a consensus clustering algorithm. It targets a range of average next-state probabilities, obtained by fitting a Markov chain to each cluster.
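To make the idea of an "average next-state probability" concrete, the following base R sketch estimates first-order transition probabilities from a set of training sequences and then averages the probability assigned to each transition observed in the test sequences. The helper `mean_next_state_prob` is hypothetical and is not part of the clickstream package; the package's internal calculation may differ in detail.

```r
# Hypothetical helper for illustration only; not part of the clickstream package.
# train_seqs and test_seqs are lists of character vectors (one vector per session).
mean_next_state_prob <- function(train_seqs, test_seqs) {
  states <- sort(unique(unlist(c(train_seqs, test_seqs))))
  counts <- matrix(0, length(states), length(states),
                   dimnames = list(states, states))

  # Count observed transitions (rows = current state, columns = next state).
  for (s in train_seqs) {
    if (length(s) < 2) next
    for (i in seq_len(length(s) - 1)) {
      counts[s[i], s[i + 1]] <- counts[s[i], s[i + 1]] + 1
    }
  }

  # Normalise each row to probabilities; rows without data stay at zero.
  row_totals <- rowSums(counts)
  probs <- counts / ifelse(row_totals == 0, 1, row_totals)

  # Probability the fitted chain gives to each transition seen in the test data.
  observed <- unlist(lapply(test_seqs, function(s) {
    if (length(s) < 2) return(numeric(0))
    vapply(seq_len(length(s) - 1),
           function(i) probs[s[i], s[i + 1]], numeric(1))
  }))
  mean(observed)
}

train <- list(c("a", "b", "a", "b"), c("a", "a"))
test  <- list(c("a", "b", "a"))
avg_prob <- mean_next_state_prob(train, test)
```

A higher value means that the chain fitted to a group of sessions predicts the next click in similar sessions with more confidence, which is the quantity a cluster is scored on under this reading of the description.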
Usage
getConsensusClusters(
trainingCLS,
testCLS,
maxIterations = 5,
optimalProbMean = 0.5,
range = 0.3,
centresMin = 2,
clusterCentresRange = 0,
order = 1,
takeHighest = FALSE,
verbose = FALSE
)
Arguments
trainingCLS |
Clickstream object with training data (this should be the data used to build the Markov chain object). |
testCLS |
Clickstream object with test data. |
maxIterations |
Number of times to iterate (repeat) through the k-means clustering. |
optimalProbMean |
The target average probability of each next-page-click prediction in a first-order Markov chain. |
range |
The range above the optimal probability to target. |
centresMin |
The minimum number of cluster centres to evaluate. |
clusterCentresRange |
The number of additional cluster centres to evaluate. |
order |
The order of the Markov chains that will be used to evaluate each cluster. |
takeHighest |
Determines whether to fall back to the highest mean next-click probability, or to raise an error, if the target is not reached after the given number of k-means iterations. |
verbose |
Should this function report extra information on progress? |
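The way the numeric arguments interact can be sketched as follows. This is one plausible reading of the argument descriptions, not the package's code: the helpers `candidate_centres` and `in_target_range` are hypothetical, and the inclusive bounds and the interpretation of `clusterCentresRange` as an offset added to `centresMin` are assumptions.

```r
# Hypothetical helpers for illustration only; not part of the clickstream package.

# Assumed reading: evaluate centresMin, centresMin + 1, ...,
# up to centresMin + clusterCentresRange cluster centres.
candidate_centres <- function(centresMin = 2, clusterCentresRange = 0) {
  seq(centresMin, centresMin + clusterCentresRange)
}

# Assumed reading: a clustering is accepted when its mean next-state
# probability lies between optimalProbMean and optimalProbMean + range.
in_target_range <- function(probMean, optimalProbMean = 0.5, range = 0.3) {
  probMean >= optimalProbMean & probMean <= optimalProbMean + range
}

centres_default <- candidate_centres()      # only 2 centres with the defaults
centres_wider   <- candidate_centres(2, 3)  # 2, 3, 4 and 5 centres
accepted <- in_target_range(c(0.4, 0.6, 0.85))
```

Under these assumptions, the defaults accept a mean probability between 0.5 and 0.8, and `takeHighest` decides what happens when no clustering falls inside that interval within `maxIterations` attempts.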
Author(s)
Theo van Kraay <theo.vankraay@hotmail.com>
Examples
training <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
              "User2,i,c,i,c,c,c,d",
              "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
              "User4,h,c,c,p,p,c,p,p,p,i,p,o",
              "User5,i,h,c,c,p,p,c,p,c,d",
              "User6,i,h,c,c,p,p,c,p,c,o",
              "User7,i,h,c,c,p,p,c,p,c,d",
              "User8,i,h,c,c,p,p,c,p,c,d,o")
test <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
          "User2,i,c,i,c,c,c,d",
          "User3,h,i,c,i,c,p,c,c,p,c,c,i,d")
trainingCLS <- as.clickstreams(training, header = TRUE)
testCLS <- as.clickstreams(test, header = TRUE)
clusters <- getConsensusClusters(trainingCLS, testCLS, maxIterations = 5,
                                 optimalProbMean = 0.40, range = 0.70,
                                 centresMin = 2, clusterCentresRange = 0,
                                 order = 1, takeHighest = FALSE,
                                 verbose = FALSE)
markovchains <- fitMarkovChains(clusters)
startPattern <- new("Pattern", sequence = c("i", "h", "c", "p"))
mc <- getOptimalMarkovChain(startPattern, markovchains, clusters)
predict(mc, startPattern)