ndl-package {ndl} | R Documentation |
Naive Discriminative Learning
Description
Naive discriminative learning implements learning and classification models based on the Rescorla-Wagner equations and their equilibrium equations.
This package provides three kinds of functionality: (1) discriminative learning based directly on the Rescorla-Wagner equations, (2) a function implementing the naive discriminative reader, a model for silent (single-word) reading, and (3) a classifier based on the equilibrium equations. The functions and datasets for the naive discriminative reader model make it possible to replicate the simulation results for Experiment 1 of Baayen et al. (2011). The classifier is provided to allow for comparisons between machine learning (svm, TiMBL, glm, random forests, etc.) and discrimination learning. Compared to standard classification algorithms, naive discriminative learning may overfit the data, albeit gracefully.
Details
The DESCRIPTION file:
Package: | ndl |
Type: | Package |
Title: | Naive Discriminative Learning |
Version: | 0.2.18 |
Date: | 2018-09-09 |
Authors@R: | c(person("Antti Arppe", role = "aut", email = "arppe@ualberta.ca"), person("Peter Hendrix", role = "aut", email = "peter.hendrix@gmail.com"), person("Petar Milin", role = "aut", email = "pmilin@gmail.com"), person("R. Harald Baayen", role = "aut", email = "harald.baayen@uni-tuebingen.de"), person("Tino Sering", role = c("aut", "cre"), email = "konstantin.sering@uni-tuebingen.de"), person("Cyrus Shaoul", role = "aut", email = "cyrus@cyrus.org")) |
Maintainer: | Tino Sering <konstantin.sering@uni-tuebingen.de> |
Description: | Naive discriminative learning implements learning and classification models based on the Rescorla-Wagner equations and their equilibrium equations. |
License: | GPL-3 |
Depends: | R (>= 3.0.2) |
Imports: | Rcpp (>= 0.11.0), MASS, Hmisc |
LinkingTo: | Rcpp |
NeedsCompilation: | yes |
Packaged: | 2015-11-10 10:28:58 UTC; kfs-studium |
RoxygenNote: | 6.1.0 |
Author: | Antti Arppe [aut], Peter Hendrix [aut], Petar Milin [aut], R. Harald Baayen [aut], Tino Sering [aut, cre], Cyrus Shaoul [aut] |
Index of help topics:
RescorlaWagner | Implementation of the Rescorla-Wagner equations. |
acts2probs | Calculate probability matrix from activation matrix, as well as predicted values |
anova.ndlClassify | Analysis of Model Fit for Naive Discriminatory Reader Models |
crosstableStatistics | Calculate statistics for a contingency table |
cueCoding | code a vector of cues as n-grams |
danks | Example data from Danks (2003), after Spellman (1996). |
dative | Dative Alternation |
estimateActivations | Estimation of the activations of outcomes (meanings) |
estimateWeights | Estimation of the association weights using the equilibrium equations of Danks (2003) for the Rescorla-Wagner equations. |
estimateWeightsCompact | Estimation of the association weights using the equilibrium equations of Danks (2003) for the Rescorla-Wagner equations using a compact binary event file. |
learn | Count cue-outcome co-occurrences needed to run the Danks equations. |
learnLegacy | Count cue-outcome co-occurrences needed to run the Danks equations. |
lexample | Lexical example data illustrating the Rescorla-Wagner equations |
modelStatistics | Calculate a range of goodness of fit measures for an object fitted with some multivariate statistical method that yields probability estimates for outcomes. |
ndl-package | Naive Discriminative Learning |
ndlClassify | Classification using naive discriminative learning. |
ndlCrossvalidate | Crossvalidation of a Naive Discriminative Learning model. |
ndlCuesOutcomes | Creation of dataframe for Naive Discriminative Learning from formula specification |
ndlStatistics | Calculate goodness of fit statistics for a naive discriminative learning model. |
ndlVarimp | Permutation variable importance for classification using naive discriminative learning. |
numbers | Example data illustrating the Rescorla-Wagner equations as applied to numerical cognition by Ramscar et al. (2011). |
orthoCoding | Code a character string (written word form) as letter n-grams |
plot.RescorlaWagner | Plot function for the output of 'RescorlaWagner'. |
plot.ndlClassify | Plot function for selected results of 'ndlClassify'. |
plurals | Artificial data set used to illustrate the Rescorla-Wagner equations and naive discriminative learning. |
predict.ndlClassify | Predict method for ndlClassify objects |
random.pseudoinverse | Calculate an approximation of the pseudoinverse of a matrix. |
serbian | Serbian case inflected nouns. |
serbianLex | Serbian lexicon with 1187 prime-target pairs. |
serbianUniCyr | Serbian case inflected nouns (in Cyrillic Unicode). |
serbianUniLat | Serbian case inflected nouns (in Latin-alphabet Unicode). |
summary.ndlClassify | A summary of a Naive Discriminatory Learning Model |
summary.ndlCrossvalidate | A summary of a crossvalidation of a Naive Discriminatory Reader Model |
think | Finnish 'think' verbs. |
For more detailed information on the core Rescorla-Wagner equations, see the functions RescorlaWagner and plot.RescorlaWagner, as well as the data sets danks, numbers (data courtesy of Michael Ramscar), and lexample (an example discussed in Baayen et al. 2011).
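As a rough illustration of the update rule behind these functions, the following base-R sketch applies the Rescorla-Wagner equations to two toy learning events. It is not the package's implementation: the helper rw_update, its parameter names and values, the fixed event order, and the toy events are all assumptions made for this example.

```r
# Minimal base-R sketch of the Rescorla-Wagner update for a single outcome.
# rw_update, its parameter names, and its default values are illustrative
# assumptions, not the interface of RescorlaWagner().
rw_update <- function(weights, present, outcome_present,
                      alpha = 0.1, beta = 0.1, lambda = 1) {
  # Summed association strength of the cues present in this event
  v_total <- sum(weights[present])
  # The target is lambda when the outcome occurs and 0 otherwise
  target <- if (outcome_present) lambda else 0
  # Only the cues that are present in the event are updated
  weights[present] <- weights[present] + alpha * beta * (target - v_total)
  weights
}

# Hypothetical learning events: letter cues, and whether "hand" is the outcome
events <- list(
  list(cues = c("h", "a", "n", "d"), outcome = TRUE),
  list(cues = c("l", "a", "n", "d"), outcome = FALSE)
)

# Start from zero association strength and cycle through the events
w <- c(h = 0, a = 0, n = 0, d = 0, l = 0)
for (run in 1:1000) {
  for (ev in events) {
    w <- rw_update(w, ev$cues, ev$outcome)
  }
}
round(w, 2)
```

In this toy setting the cue "h", which only occurs with the outcome, ends up with the largest positive weight, while "l", which only occurs without it, becomes negative.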
The user-level functions for naive discriminative learning are estimateWeights and estimateActivations. The relevant data sets are serbian, serbianUniCyr, serbianUniLat, and serbianLex. The examples for serbianLex present the full simulation for Experiment 1 of Baayen et al. (2011).
Key functionality for the user is provided by the functions orthoCoding, estimateWeights, and estimateActivations. orthoCoding calculates the letter n-grams for character strings, to be used as cues. It is assumed that the meaning or meanings (separated by underscores if there is more than one) are available as outcomes. The frequency with which each (unique) combination of cues and outcomes occurs is required. For some example input data sets, see: danks, plurals, serbian, serbianUniCyr and serbianUniLat.
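To make the idea of letter n-gram cues concrete, here is a small base-R sketch. The helper letter_ngrams is hypothetical, and the "#" word-boundary marker and underscore separator are assumptions chosen for this illustration; the exact output format of orthoCoding may differ.

```r
# Hypothetical helper (not part of ndl): split a word into letter n-grams,
# marking word boundaries with "#" and joining the n-grams with "_".
letter_ngrams <- function(word, n = 2, boundary = "#", sep = "_") {
  padded <- paste0(boundary, word, boundary)
  chars <- strsplit(padded, "")[[1]]
  # Words shorter than n yield the padded string as a single cue
  if (length(chars) < n) return(padded)
  starts <- seq_len(length(chars) - n + 1)
  grams <- vapply(starts,
                  function(i) paste(chars[i:(i + n - 1)], collapse = ""),
                  character(1))
  paste(grams, collapse = sep)
}

letter_ngrams("hand", n = 2)
# "#h_ha_an_nd_d#"
```

A cue string of this kind would sit in the Cues column of the input data, next to the underscore-separated meanings in Outcomes.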
The function estimateWeights estimates the association strengths of cues to outcomes, using the equilibrium equations presented in Danks (2003). The function estimateActivations estimates the activations of outcomes (meanings) given cues (n-grams).
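The equilibrium approach can be sketched in base R plus MASS (listed under Imports above). The sketch below is an assumption-laden illustration rather than the package's C++ implementation: the toy data, the helper indicator, and all variable names are hypothetical. It builds the conditional co-occurrence matrices, solves the Danks equations with a pseudoinverse, and then sums the weights of the cues present in each event to obtain activations.

```r
library(MASS)  # provides ginv(), the Moore-Penrose pseudoinverse

# Hypothetical toy events: underscore-separated cues and outcomes,
# with the frequency of each unique combination.
events <- data.frame(
  Cues      = c("h_a_n_d", "l_a_n_d", "h_a_n_d_s"),
  Outcomes  = c("hand", "land", "hand_PLURAL"),
  Frequency = c(10, 8, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: binary event-by-item indicator matrix
indicator <- function(x) {
  parts <- strsplit(x, "_")
  items <- sort(unique(unlist(parts)))
  m <- t(vapply(parts, function(p) as.numeric(items %in% p),
                numeric(length(items))))
  colnames(m) <- items
  m
}
cue_mat <- indicator(events$Cues)
out_mat <- indicator(events$Outcomes)

# Frequency-weighted co-occurrence counts
cc <- t(cue_mat) %*% (cue_mat * events$Frequency)  # cue by cue
co <- t(cue_mat) %*% (out_mat * events$Frequency)  # cue by outcome

# Conditional probabilities given the row cue: Pr(C_j | C_i) and Pr(O | C_i)
cue_freq <- diag(cc)
C <- cc / cue_freq
O <- co / cue_freq

# Equilibrium weights: solve C %*% W = O using the pseudoinverse
W <- ginv(C) %*% O
dimnames(W) <- list(colnames(cue_mat), colnames(out_mat))

# Activation of an outcome = sum of the weights of the cues in the event
activations <- cue_mat %*% W
round(activations, 3)
```

Because the three toy events have linearly independent cue patterns, the activations in this example reproduce the outcome indicators; with real data the fit is generally only approximate.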
The Rcpp-based learn and learnLegacy functions use a C++ function to compute the conditional co-occurrence matrices required in the equilibrium equations. These are internally used by estimateWeights and should not be used directly by users of the package.
The key function for naive discriminative classification is ndlClassify; see data sets think and dative for examples.
Author(s)
Maintainer: Tino Sering <konstantin.sering@uni-tuebingen.de>
Author Contributions: Initial concept by R. Harald Baayen with contributions from Petar Milin and Peter Hendrix. First R coding done by R. Harald Baayen.
Initial R package development until version 0.1.6 by Antti Arppe. Initial documentation by Antti Arppe. Initial optimizations in C by Petar Milin and Antti Arppe.
Classification functionality developed further by Antti Arppe.
In versions 0.2.14 to 0.2.16, improvements to the NDL algorithm by Petar Milin and Cyrus Shaoul, and improved performance optimizations (C++ and Rcpp) by Cyrus Shaoul.
From version 0.2.17 onwards, bug fixes and CRAN compliance by Tino Sering.
References
Baayen, R. H., Milin, P., Filipovic Durdevic, D., Hendrix, P. and Marelli, M. (2011) An amorphous model for morphological processing in visual comprehension based on naive discriminative learning. Psychological Review, 118, 438-482.
Baayen, R. H. (2011) Corpus linguistics and naive discriminative learning. Brazilian Journal of Applied Linguistics, 11, 295-328.
Arppe, A. and Baayen, R. H. (in prep.) Statistical classification and principles of human learning.
Examples
## Not run:
# Rescorla-Wagner
data(lexample)
lexample$Cues <- orthoCoding(lexample$Word, grams=1)
lexample.rw <- RescorlaWagner(lexample, nruns=25, traceCue="h",
   traceOutcome="hand")
plot(lexample.rw)
mtext("h - hand", 3, 1)
data(numbers)
traceCues <- c("exactly1", "exactly2", "exactly3", "exactly4", "exactly5",
   "exactly6", "exactly7", "exactly10", "exactly15")
traceOutcomes <- c("1", "2", "3", "4", "5", "6", "7", "10", "15")
ylimit <- c(0,1)
par(mfrow=c(3,3), mar=c(4,4,1,1))
# one panel per cue-outcome pair
for (i in seq_along(traceCues)) {
   numbers.rw <- RescorlaWagner(numbers, nruns=1, traceCue=traceCues[i],
      traceOutcome=traceOutcomes[i])
   plot(numbers.rw, ylimit=ylimit)
   mtext(paste(traceCues[i], " - ", traceOutcomes[i], sep=""), side=3, line=-1,
      cex=0.7)
}
par(mfrow=c(1,1))
# naive discriminative learning (for complete example, see serbianLex)
# This function uses a Unicode dataset.
data(serbianUniCyr)
serbianUniCyr$Cues <- orthoCoding(serbianUniCyr$WordForm, grams=2)
serbianUniCyr$Outcomes <- serbianUniCyr$LemmaCase
sw <- estimateWeights(cuesOutcomes=serbianUniCyr, hasUnicode=TRUE)
desiredItems <- unique(serbianUniCyr["Cues"])
desiredItems$Outcomes <- ""
activations <- estimateActivations(desiredItems, sw)$activationMatrix
rownames(activations) <- unique(serbianUniCyr[["WordForm"]])
syntax <- c("acc", "dat", "gen", "ins", "loc", "nom", "Pl", "Sg")
activations2 <- activations[,!is.element(colnames(activations), syntax)]
head(rownames(activations2),50)
head(colnames(activations2),8)
image(activations2, xlab="word forms", ylab="meanings", xaxt="n", yaxt="n")
mtext(c("yena", "...", "zvuke"), side=1, line=1, at=c(0, 0.5, 1), adj=c(0,0,1))
mtext(c("yena", "...", "zvuk"), side=2, line=1, at=c(0, 0.5, 1), adj=c(0,0,1))
# naive discriminative classification
data(think)
think.ndl <- ndlClassify(Lexeme ~ Person + Number + Agent + Patient + Register,
   data=think)
summary(think.ndl)
plot(think.ndl, values="weights", type="hist", panes="multiple")
plot(think.ndl, values="probabilities", type="density")
## End(Not run)