LCS {qualV} | R Documentation |
Algorithm for the Longest Common Subsequence Problem
Description
Determines the longest common subsequence of two strings.
Usage
LCS(a, b)
Arguments
a | vector (numeric or character), missing values are not accepted |
b | vector (numeric or character), missing values are not accepted |
Details
A longest common subsequence (LCS) is a common subsequence of two strings of maximum length. The LCS problem consists of finding an LCS of two given strings and its length (LLCS). A qualitative similarity index (QSI) is computed by dividing the LLCS by the maximum of the lengths of 'a' and 'b'.
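The definitions above can be illustrated with a minimal sketch in plain R. It uses the textbook dynamic-programming table followed by a traceback; this is an assumption made for illustration and not necessarily how the C code in the package works. The name lcs_sketch is hypothetical and not part of qualV.

```r
# Hypothetical helper for illustration; not part of qualV and not its C implementation.
lcs_sketch <- function(a, b) {
  stopifnot(!anyNA(a), !anyNA(b))  # mirror the "no missing values" requirement
  m <- length(a)
  n <- length(b)
  # tab[i + 1, j + 1] is the LLCS of the prefixes a[1:i] and b[1:j];
  # the first row and column stand for the empty prefix.
  tab <- matrix(0L, nrow = m + 1, ncol = n + 1)
  for (i in seq_len(m)) {
    for (j in seq_len(n)) {
      if (a[i] == b[j]) {
        tab[i + 1, j + 1] <- tab[i, j] + 1L
      } else {
        tab[i + 1, j + 1] <- max(tab[i, j + 1], tab[i + 1, j])
      }
    }
  }
  # Walk back from the bottom-right corner to collect one LCS.
  out <- a[0]
  i <- m
  j <- n
  while (i > 0 && j > 0) {
    if (a[i] == b[j]) {
      out <- c(a[i], out)
      i <- i - 1
      j <- j - 1
    } else if (tab[i, j + 1] >= tab[i + 1, j]) {
      i <- i - 1
    } else {
      j <- j - 1
    }
  }
  llcs <- tab[m + 1, n + 1]
  list(LLCS = llcs, LCS = out, QSI = llcs / max(m, n))
}

a <- c("b", "c", "a", "b", "c", "b")
b <- c("a", "b", "c", "c", "b")
res <- lcs_sketch(a, b)
res$LLCS  # 4
res$QSI   # 4 / 6
```

For these inputs the sketch returns "b", "c", "c", "b", but "a", "b", "c", "b" is a common subsequence of the same length, so an LCS is in general not unique.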
Value
a | vector 'a' |
b | vector 'b' |
LLCS | length of the LCS |
LCS | longest common subsequence |
QSI | qualitative similarity index |
va | one possible LCS of vector 'a' |
vb | one possible LCS of vector 'b' |
Note
LCS is now using a C version of the algorithm provided by Dominik Reusser.
References
Wagner, R. A. and Fischer, M. J. (1974) The String-to-String Correction Problem. Journal of the ACM, 21, 168-173.
Paterson, M. and Dancik, V. (1994) Longest Common Subsequences. Mathematical Foundations of Computer Science, 841, 127-142.
Gusfield, D. (1997) Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, England, ISBN 0-521-58519-8.
Examples
library(qualV)

# direct use
a <- c("b", "c", "a", "b", "c", "b")
b <- c("a", "b", "c", "c", "b")
print(LCS(a, b))
# a constructed example
x <- seq(0, 2 * pi, 0.1) # time
y <- 5 + sin(x) # a process
o <- y + rnorm(length(x), sd = 0.2) # observation with random error
p <- y + 0.1 # simulation with systematic bias
plot(x, o); lines(x, p)
lcs <- LCS(f.slope(x, o), f.slope(x, p)) # too much noise
lcs$LLCS
lcs$QSI
os <- ksmooth(x, o, kernel = "normal", bandwidth = dpill(x, o), x.points = x)$y
lcs <- LCS(f.slope(x, os), f.slope(x, p))
lcs$LLCS
lcs$QSI
# observed and simulated data with non-matching time intervals
data(phyto)
bbobs <- dpill(obs$t, obs$y)
n <- tail(obs$t, n = 1) - obs$t[1] + 1
obsdpill <- ksmooth(obs$t, obs$y, kernel = "normal", bandwidth = bbobs,
                    n.points = n)
obss <- data.frame(t = obsdpill$x, y = obsdpill$y)
obss <- obss[match(sim$t, obss$t),]
obs_f1 <- f.slope(obss$t, obss$y)
sim_f1 <- f.slope(sim$t, sim$y)
lcs <- LCS(obs_f1, sim_f1)
lcs$QSI