nysiis {phonics}    R Documentation

New York State Identification and Intelligence System


Description

The NYSIIS phonetic algorithm


Usage

nysiis(word, maxCodeLen = 6, modified = FALSE, clean = TRUE)



Arguments

word: string or vector of strings to encode

maxCodeLen: maximum length of the resulting encodings, in characters

modified: if TRUE, use the modified NYSIIS algorithm

clean: if TRUE, return NA for unknown alphabetical characters


Details

The nysiis function phonetically encodes the given string using the New York State Identification and Intelligence System (NYSIIS) algorithm. The algorithm is based on the implementation provided by Wikipedia and is implemented in pure R using regular expressions.

The argument maxCodeLen limits the length, in characters, of the returned NYSIIS code. The default is 6.

The argument modified directs nysiis to use the modified NYSIIS method instead of the original. The default is FALSE.
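As a minimal sketch of how these two arguments interact, the snippet below encodes the same inputs with the defaults, with the modified method, and with a shorter code length. It assumes the phonics package is installed; the vector name states and the choice of inputs are only for illustration, and no particular codes are claimed here.

```r
library(phonics)

# Illustrative inputs; any character vector works.
states <- c("Alabama", "Alaska", "Mississippi")

# Original algorithm with the default maxCodeLen = 6.
original_codes <- nysiis(states)

# Modified algorithm, same length limit.
modified_codes <- nysiis(states, modified = TRUE)

# Original algorithm, codes truncated to at most 4 characters.
short_codes <- nysiis(states, maxCodeLen = 4)

# Side-by-side view for comparison.
data.frame(states, original_codes, modified_codes, short_codes)
```

Comparing the columns makes it easy to see whether the two methods disagree for a given name and how much information a shorter code discards.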

The nysiis algorithm is only defined for inputs over the standard English alphabet, i.e., "A-Z". Non-alphabetical characters are removed from the string in a locale-dependent fashion. This strips spaces, hyphens, and numbers. Other letters, such as "Ü", may be permissible in the current locale but are unknown to nysiis. For inputs outside of its known range, the output is undefined, so NA is returned and a warning is thrown. If clean is FALSE, nysiis attempts to process the strings anyway. The default is TRUE.
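The following sketch shows one way to observe this behavior without letting the warning interrupt a script. It assumes the phonics package is installed; with_warnings is a hypothetical helper written for this example, not part of phonics, and the exact warning text is not assumed.

```r
library(phonics)

# Hypothetical helper (not part of phonics): evaluate an expression,
# collect any warning messages instead of printing them, and return both.
with_warnings <- function(expr) {
  msgs <- character(0)
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = msgs)
}

# The second entry contains a letter outside A-Z.
surnames <- c("Muller", "M\u00fcller")

# clean = TRUE (the default): unknown letters should yield NA and a warning.
strict <- with_warnings(nysiis(surnames))

# clean = FALSE: nysiis attempts to encode the strings anyway.
lenient <- with_warnings(nysiis(surnames, clean = FALSE))

strict$value
strict$warnings
lenient$value
```

Checking is.na(strict$value) afterwards is a simple way to find the inputs that need transliteration before encoding.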


Value

the NYSIIS encoded character vector


References

James P. Howard, II, "Phonetic Spelling Algorithm Implementations for R," Journal of Statistical Software, vol. 95, no. 8, (2020), pp. 1–21, <doi:10.18637/jss.v095.i08>.

Robert L. Taft, Name search techniques, Bureau of Systems Development, Albany, New York, 1970.

See Also

Other phonics: caverphone(), cologne(), lein(), metaphone(), mra_encode(), onca(), phonex(), phonics(), rogerroot(), soundex(), statcan()


Examples

nysiis(c("Alabama", "Alaska"), modified = TRUE)
nysiis("mississippi", 4)

[Package phonics version 1.3.10 Index]