clusterImage {WordListsAnalytics} | R Documentation |
Description
This function takes data from a property listing task, a concept, and a threshold, and clusters the properties listed for that concept according to the order in which they were listed. Using the properties mentioned by all participants for the concept, the algorithm estimates a similarity between each pair of properties, based on the number of words mentioned between them. For example, if properties A and B are usually mentioned one after the other, their similarity will be higher than that of properties A and C, which are usually not even mentioned together. Properties whose similarity to every other property is below the user-defined threshold are discarded from the plot.
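The idea of an order-based similarity can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper `order_similarity`, the toy data, and the scoring rule (the inverse of the gap between listing positions, averaged over participants) are assumptions chosen only to show how adjacent properties end up more similar than distant or never co-listed ones.

```r
# Toy property listing data: one row per listed property, in listing order.
toy <- data.frame(
  ID       = c(1, 1, 1, 2, 2, 2),
  Concept  = "Ability",
  Property = c("A", "B", "C", "A", "B", "D"),
  stringsAsFactors = FALSE
)

# Hypothetical similarity (an assumption, not the package's formula):
# for each participant, a pair scores 1 / (gap between listing positions);
# pairs not listed together score 0. Scores are averaged over participants.
order_similarity <- function(df) {
  props <- sort(unique(df$Property))
  ids <- unique(df$ID)
  sim <- matrix(0, length(props), length(props), dimnames = list(props, props))
  for (id in ids) {
    listed <- df$Property[df$ID == id]
    for (i in seq_along(listed)) {
      for (j in seq_along(listed)) {
        if (i != j) {
          sim[listed[i], listed[j]] <- sim[listed[i], listed[j]] + 1 / abs(i - j)
        }
      }
    }
  }
  sim / length(ids)
}

sim <- order_similarity(toy)

# Discard properties whose best similarity to any other property is below
# the threshold (the value 0.6 is arbitrary for this toy example).
threshold <- 0.6
keep <- apply(sim, 1, max) >= threshold
names(keep)[keep]
```

Here A and B are always adjacent, so they score highest and survive the threshold, while C and D are each listed by only one participant and are dropped.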
Usage
clusterImage(data, distThreshold, concept = NULL)
Arguments
data |
Data frame with 3 columns: ID, Concept and Property |
distThreshold |
Distance value. Properties are assigned to a specific cluster if their similarity is greater than distThreshold. |
concept |
Text value. Clusters will only be generated with properties from this concept. |
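As a rough illustration of the expected input, the sketch below builds a small data frame with the three columns named above and keeps only the rows for one concept. The values are made up, and the subsetting step only mimics what the `concept` argument is described as doing; it is not taken from the package source.

```r
# Hypothetical input in the documented shape: ID, Concept, Property.
listing <- data.frame(
  ID       = c(1, 1, 2, 2),
  Concept  = c("Ability", "Ability", "Ability", "Anger"),
  Property = c("skill", "talent", "skill", "emotion"),
  stringsAsFactors = FALSE
)

# Check that the three documented columns are present.
stopifnot(all(c("ID", "Concept", "Property") %in% names(listing)))

# Keep only the properties listed for a single concept.
one_concept <- listing[listing$Concept == "Ability", ]
```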
Value
List with 2 elements: ggplot2 plot and data frame with cluster information
Examples
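library(WordListsAnalytics)
# Property listing data (columns: ID, Concept, Property)
data_cpn = data.frame(CPN_27)
threshold = 0.061
concept = "Ability"
# Returns a list with the ggplot2 plot and the cluster data frame
cluster_data = clusterImage(data_cpn, threshold, concept)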