compare_neighborhoods {kanjistat} | R Documentation |
Compare distances of nearest kanji
Description
List distances to nearest neighbors of a given kanji in terms of a reference distance (which is currently only the stroke edit distance) and compare with values in terms of another distance (currently only the component transport distance, a.k.a. kanji distance).
Usage
compare_neighborhoods(
kan,
refdist = "strokedit",
refnn = 10,
compdist = "kanjidist",
compnn = 0,
...
)
Arguments
kan | a kanji (currently only as a single UTF-8 character).
refdist | the name of the reference distance (currently only "strokedit").
refnn | the number of nearest neighbors in terms of the reference distance.
compdist | a character vector. The name(s) of one or several other distances to compare with (currently only "kanjidist").
compnn | the number of nearest neighbors in terms of the other distance(s). If this is positive it is assumed that the suggested package kanjistat.data is available.
... | further parameters that are passed to
Value
A matrix of distances with refnn + compnn columns named by the nearest neighbors of kan (first in terms of the reference distance, then the other distances) and 1 + length(compdist) rows named by the type of distance.
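The layout described above can be pictured with a small hand-built matrix. This is only a sketch in base R: the numbers and the neighbor labels are invented, no kanjistat function is called, and the row order (reference distance first) is an assumption rather than something this page states.

```r
# Hypothetical illustration of the result layout; all values are made up.
refnn <- 3                      # neighbors chosen by the reference distance
compnn <- 2                     # neighbors chosen by the other distance
compdist <- "kanjidist"

# Placeholder column labels; the real function names columns by kanji.
neighbors <- c("ref1", "ref2", "ref3", "comp1", "comp2")

res <- matrix(
  c(1.2,  1.5,  1.9,  2.4,  3.0,    # invented reference distances
    0.10, 0.22, 0.18, 0.05, 0.07),  # invented comparison distances
  nrow = 1 + length(compdist),
  byrow = TRUE,
  dimnames = list(c("strokedit", compdist), neighbors)
)

dim(res)                        # (1 + length(compdist)) x (refnn + compnn)
res["kanjidist", seq_len(refnn)]  # other distance to the reference neighbors
```

Indexing by row name, as in the last line, is a convenient way to pull out one type of distance for a chosen set of neighbors.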
Warning
This is only a first draft of the function and its interface and details may change considerably in the future.
As there is currently no precomputed kanjidist matrix, there is a huge difference in computation time between setting compnn = 0 (only kanji distances to the refnn nearest neighbors in terms of refdist have to be computed) and setting compnn to any value > 0 (kanji distances to all 2135 other Jouyou kanji have to be computed in order to determine the compnn nearest neighbors; depending on the system and parameter settings this can take (roughly) anywhere between 2 minutes and an hour).
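The difference in workload can be sketched by counting distance evaluations in a toy setting. The code below is not the package's implementation: toy_dist is a hypothetical stand-in for an expensive distance, ref_d stands in for reference distances that are assumed to be cheap to look up, and kanji are represented by plain indices.

```r
# Toy model: count how often an expensive distance must be evaluated.
n_other <- 2135                 # other Jouyou kanji, as stated above
refnn <- 10
compnn <- 5

set.seed(42)
ref_d <- runif(n_other)         # stand-in for looked-up reference distances

calls <- 0
toy_dist <- function(i) {       # hypothetical expensive distance
  calls <<- calls + 1
  abs(sin(i))
}

# Case compnn = 0: evaluate only at the refnn reference neighbors.
ref_idx <- order(ref_d)[seq_len(refnn)]
calls <- 0
d_ref_only <- vapply(ref_idx, toy_dist, numeric(1))
calls_compnn0 <- calls

# Case compnn > 0: every distance is needed before the nearest can be chosen.
calls <- 0
d_all <- vapply(seq_len(n_other), toy_dist, numeric(1))
comp_idx <- order(d_all)[seq_len(compnn)]
calls_compnn_pos <- calls

c(compnn0 = calls_compnn0, compnn_positive = calls_compnn_pos)
```

In this toy model the second case needs refnn evaluations replaced by one per other kanji, which is why a precomputed matrix would remove most of the cost.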
Examples
# compare_neighborhoods("\u6674", refnn=5, compo_seg_depth=4, approx="pcweighted",
# compnn=0, minor_warnings=FALSE)