best.guess {SemNetCleaner} | R Documentation |
Makes Best Guess for Spelling Correction
Description
A wrapper function that returns the best guess for a misspelled word
based on its letters, the ordering of those letters, and the potential
for letters to be interchanged. The
Damerau-Levenshtein distance
guides inferences about which dictionary word the participant was trying to spell
(see SemNetDictionaries).
Usage
best.guess(word, full.dictionary, dictionary = NULL, tolerance = 1)
Arguments
word |
Character. The word for which to find best-guess spellings in the dictionary |
full.dictionary |
Character vector.
The dictionary to search for best guesses in.
See |
dictionary |
Character.
A dictionary from |
tolerance |
Numeric.
The distance tolerance set for automatic spell-correction purposes.
Distances are Damerau-Levenshtein distances. A unique word (i.e., n = 1) within the distance tolerance is automatically output as the best guess. The default is based on Damerau's (1964) finding that more than 80% of all human misspellings can be expressed by a single error (i.e., an insertion, deletion, substitution, or transposition). If more than one word is within the distance tolerance, all of them are provided as potential options. The recommended and default distance tolerance is 1 |
Value
The best guess(es) of the word
Author(s)
Alexander Christensen <alexpaulchristensen@gmail.com>
References
Damerau, F. J. (1964). A technique for computer detection and correction of spelling errors. Communications of the ACM, 7, 171-176.
Examples
# Misspelled "bombay"
best.guess("bomba", full.dictionary = SemNetDictionaries::animals.dictionary)
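
The tolerance rule described under Arguments can be illustrated with a small base-R sketch. This is not the package's implementation: osa_distance and sketch_best_guess are hypothetical helper names, the dictionary is a made-up toy vector, and the distance is the optimal string alignment variant (a restricted form of Damerau-Levenshtein), chosen only to keep the example self-contained.

```r
# Hypothetical helper: optimal string alignment distance, counting
# insertions, deletions, substitutions, and adjacent transpositions.
osa_distance <- function(a, b) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  n <- length(x)
  m <- length(y)
  # d[i + 1, j + 1] is the distance between the first i letters of a
  # and the first j letters of b
  d <- matrix(0, nrow = n + 1, ncol = m + 1)
  d[, 1] <- 0:n
  d[1, ] <- 0:m
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- if (x[i] == y[j]) 0 else 1
      d[i + 1, j + 1] <- min(
        d[i, j + 1] + 1,  # deletion
        d[i + 1, j] + 1,  # insertion
        d[i, j] + cost    # substitution (or match)
      )
      # adjacent transposition, e.g. "hrose" -> "horse"
      if (i > 1 && j > 1 && x[i] == y[j - 1] && x[i - 1] == y[j]) {
        d[i + 1, j + 1] <- min(d[i + 1, j + 1], d[i - 1, j - 1] + 1)
      }
    }
  }
  d[n + 1, m + 1]
}

# Hypothetical helper: keep every dictionary word within the tolerance.
sketch_best_guess <- function(word, dictionary, tolerance = 1) {
  distances <- vapply(dictionary, osa_distance, numeric(1), b = word)
  unname(dictionary[distances <= tolerance])
}

# Toy dictionary (an assumption, not SemNetDictionaries data)
toy_dictionary <- c("bombay", "bobcat", "bonobo", "zebra")

sketch_best_guess("bomba", toy_dictionary)           # "bombay"
sketch_best_guess("gat", c("cat", "rat", "horse"))   # "cat" "rat"
```

With a single word inside the tolerance, that word is the best guess; with several, all of them come back as options, which mirrors the behaviour the tolerance argument describes.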