clusterImage {WordListsAnalytics}    R Documentation

Cluster the properties listed for a concept by their listing order

Description

This function receives the data from a property listing task, a concept, and a threshold. It clusters the properties listed for that concept according to the order in which they were listed. Using the properties mentioned by all users for the concept, the algorithm estimates a similarity between each pair of properties, based on the number of words mentioned between them. For example, if properties A and B are usually mentioned one after the other, their similarity will be higher than that of properties A and C, which are usually not even mentioned together. Properties whose similarity to every other property is below the user-defined threshold are left out of the plot.
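The idea of an order-based similarity can be sketched with a toy example. This is only an illustration under stated assumptions, not the package's actual formula: the helper `order_similarity()` and the toy data are hypothetical, rows are assumed to appear in listing order for each participant, and closeness is taken as 1 / (1 + number of properties listed in between).

```r
# Toy property listing data: one row per listed property, in listing order.
# Column names follow the documented input; the values are invented.
toy <- data.frame(
  ID       = c(1, 1, 1, 2, 2, 2, 3, 3),
  Concept  = "Dog",
  Property = c("barks", "has tail", "loyal",
               "barks", "has tail", "furry",
               "loyal", "furry"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of WordListsAnalytics).
# For one participant, closeness of two properties is 1 / (positions apart),
# i.e. 1 / (1 + number of properties listed between them), so adjacent
# properties score 1. Participants who did not list both contribute 0.
order_similarity <- function(df) {
  props <- sort(unique(df$Property))
  ids   <- unique(df$ID)
  sim   <- matrix(0, length(props), length(props),
                  dimnames = list(props, props))
  for (id in ids) {
    listed    <- df$Property[df$ID == id]
    pos       <- match(props, listed)        # NA when a property was not listed
    closeness <- 1 / abs(outer(pos, pos, "-"))
    closeness[!is.finite(closeness)] <- 0    # diagonal (1/0) and missing pairs
    sim <- sim + closeness
  }
  sim / length(ids)                          # average over participants
}

sim <- order_similarity(toy)
```

In this toy data, "barks" and "has tail" are listed back to back by two of the three participants, so `sim["barks", "has tail"]` comes out higher than `sim["barks", "loyal"]`, which mirrors the A/B versus A/C example above.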

Usage

clusterImage(data, distThreshold, concept = NULL)

Arguments

data

Data frame with 3 columns: ID, Concept and Property

distThreshold

Distance value. Properties are assigned to a specific cluster if their similarity is greater than distThreshold.

concept

Text value. Clusters will only be generated with properties from this concept.
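As a rough illustration of how a threshold such as distThreshold can separate connected properties from isolated ones, the sketch below filters an invented similarity matrix. The matrix values and the variable names (`dist_threshold`, `best_match`, `kept`, `dropped`) are assumptions made for this example; the package's internal procedure may differ.

```r
# Invented symmetric similarity matrix for four properties.
props <- c("barks", "has tail", "loyal", "can fly")
sim <- matrix(c(0.00, 0.60, 0.30, 0.02,
                0.60, 0.00, 0.25, 0.01,
                0.30, 0.25, 0.00, 0.03,
                0.02, 0.01, 0.03, 0.00),
              nrow = 4, byrow = TRUE, dimnames = list(props, props))

dist_threshold <- 0.061  # same value as in the Examples section

# Keep a property only if it is similar enough to at least one other property.
best_match <- apply(sim, 1, max)  # the diagonal is 0, so self-similarity is ignored
kept    <- names(best_match)[best_match > dist_threshold]
dropped <- setdiff(props, kept)
```

Here "can fly" has no similarity above the threshold with any other property, so it is the only one that ends up in `dropped`.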

Value

List with 2 elements: a ggplot2 plot and a data frame with cluster information.

Examples

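library(WordListsAnalytics)

# Build the input data frame from CPN_27
data_cpn <- data.frame(CPN_27)
threshold <- 0.061
concept <- "Ability"
# Returns a list with the plot and the cluster data frame
cluster_data <- clusterImage(data_cpn, threshold, concept)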

[Package WordListsAnalytics version 0.2.2 Index]