findThreshold {stmCorrViz} | R Documentation
Find appropriate threshold range
Description
This function performs a grid search over candidate clustering thresholds to identify the range of valid values and to show how the level of aggregation varies within that range.
Usage
findThreshold(mod, documents_raw=NULL, documents_matrix=NULL,
range_min=.05, range_max=5, step=.05)
Arguments
mod | A fitted
documents_raw | The raw documents used to generate the STM model. A character vector where each entry is the full text of a document.
documents_matrix | Document-term matrix representation of the raw documents, as generated by the
range_min | Lower bound of the range to be searched.
range_max | Upper bound of the range to be searched.
step | Step size for the grid search.
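As a rough illustration of what range_min, range_max and step describe together, the candidate thresholds can be pictured as an evenly spaced sequence. The sketch below assumes the grid behaves like base R's seq(); the package's internal construction is not shown in this page, and candidate_thresholds is a hypothetical name used only here.

```r
# Assumption: the search grid is an evenly spaced sequence from
# range_min to range_max, mirroring the defaults in the Usage section.
range_min <- 0.05
range_max <- 5
step <- 0.05

# Hypothetical name; not an object created by the package.
candidate_thresholds <- seq(from = range_min, to = range_max, by = step)

# Number of thresholds such a grid would contain.
length(candidate_thresholds)

# First few candidate values.
head(candidate_thresholds, 3)
```

A smaller step gives a finer picture of where the valid range begins and ends, at the cost of more clustering runs.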
Value
A data frame containing the following columns:
- threshold: Threshold value.
- valid: Binary value; 1 if clustering is successful using given threshold; 0 if not.
- juncture_points: Number of juncture points in the resulting clustering tree; -1 if run is unsuccessful. Lower threshold values yield a higher number of juncture points, corresponding to more binary splits and deeper trees. Higher threshold values produce fewer juncture points, corresponding to trees that have significant breadth rather than depth.
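The columns above can be inspected with ordinary data frame operations. The sketch below uses a hypothetical stand-in for the returned data frame; the values in res are invented purely to show the column layout and are not output from a real model.

```r
# Hypothetical stand-in for the data frame returned by findThreshold();
# the numbers are made up for illustration only.
res <- data.frame(
  threshold       = c(0.5, 1.0, 1.5, 2.0, 2.5),
  valid           = c(0, 1, 1, 1, 0),
  juncture_points = c(-1, 6, 4, 2, -1)
)

# Keep only the thresholds for which clustering succeeded.
valid_res <- res[res$valid == 1, ]

# Lower and upper bounds of the valid threshold range.
valid_range <- range(valid_res$threshold)

# Order valid thresholds from most to fewest juncture points,
# i.e. from deeper trees to broader ones.
by_depth <- valid_res[order(-valid_res$juncture_points), ]

print(valid_range)
print(by_depth)
```

Filtering on valid before looking at juncture_points keeps the -1 placeholder rows out of any comparison of tree depth.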