R: Naive Discriminative Learning

ndl-package {ndl}

R Documentation

Naive Discriminative Learning

Description

Naive discriminative learning implements learning and classification models based on the Rescorla-Wagner equations and their equilibrium equations.

This package provides three kinds of functionality: (1) discriminative learning based directly on the Rescorla-Wagner equations, (2) functions implementing the naive discriminative reader, a model for silent (single-word) reading, and (3) a classifier based on the equilibrium equations. The functions and datasets for the naive discriminative reader model make it possible to replicate the simulation results for Experiment 1 of Baayen et al. (2011). The classifier is provided to allow for comparisons between machine learning methods (svm, TiMBL, glm, random forests, etc.) and discrimination learning. Compared to standard classification algorithms, naive discriminative learning may overfit the data, albeit gracefully.

Details

The DESCRIPTION file:

Package:	ndl
Type:	Package
Title:	Naive Discriminative Learning
Version:	0.2.18
Date:	2018-09-09
Authors@R:	c(person("Antti Arppe", role = "aut", email = "arppe@ualberta.ca"), person("Peter Hendrix", role = "aut", email = "peter.hendrix@gmail.com"), person("Petar Milin", role = "aut", email = "pmilin@gmail.com"), person("R. Harald Baayen", role = "aut", email = "harald.baayen@uni-tuebingen.de"), person("Tino Sering", role = c("aut", "cre"), email = "konstantin.sering@uni-tuebingen.de"), person("Cyrus Shaoul", role = "aut", email = "cyrus@cyrus.org"))
Maintainer:	Tino Sering <konstantin.sering@uni-tuebingen.de>
Description:	Naive discriminative learning implements learning and classification models based on the Rescorla-Wagner equations and their equilibrium equations.
License:	GPL-3
Depends:	R (>= 3.0.2)
Imports:	Rcpp (>= 0.11.0), MASS, Hmisc
LinkingTo:	Rcpp
NeedsCompilation:	yes
Packaged:	2015-11-10 10:28:58 UTC; kfs-studium
RoxygenNote:	6.1.0
Author:	Antti Arppe [aut], Peter Hendrix [aut], Petar Milin [aut], R. Harald Baayen [aut], Tino Sering [aut, cre], Cyrus Shaoul [aut]

Index of help topics:

RescorlaWagner          Implementation of the Rescorla-Wagner
                        equations.
acts2probs              Calculate probability matrix from activation
                        matrix, as well as predicted values
anova.ndlClassify       Analysis of Model Fit for Naive Discriminatory
                        Reader Models
crosstableStatistics    Calculate statistics for a contingency table
cueCoding               code a vector of cues as n-grams
danks                   Example data from Danks (2003), after Spellman
                        (1996).
dative                  Dative Alternation
estimateActivations     Estimation of the activations of outcomes
                        (meanings)
estimateWeights         Estimation of the association weights using the
                        equilibrium equations of Danks (2003) for the
                        Rescorla-Wagner equations.
estimateWeightsCompact
                        Estimation of the association weights using the
                        equilibrium equations of Danks (2003) for the
                        Rescorla-Wagner equations using a compact
                        binary event file.
learn                   Count cue-outcome co-occurrences needed to run
                        the Danks equations.
learnLegacy             Count cue-outcome co-occurrences needed to run
                        the Danks equations.
lexample                Lexical example data illustrating the
                        Rescorla-Wagner equations
modelStatistics         Calculate a range of goodness of fit measures
                        for an object fitted with some multivariate
                        statistical method that yields probability
                        estimates for outcomes.
ndl-package             Naive Discriminative Learning
ndlClassify             Classification using naive discriminative
                        learning.
ndlCrossvalidate        Crossvalidation of a Naive Discriminative
                        Learning model.
ndlCuesOutcomes         Creation of dataframe for Naive Discriminative
                        Learning from formula specification
ndlStatistics           Calculate goodness of fit statistics for a
                        naive discriminative learning model.
ndlVarimp               Permutation variable importance for
                        classification using naive discriminative
                        learning.
numbers                 Example data illustrating the Rescorla-Wagner
                        equations as applied to numerical cognition by
                        Ramscar et al. (2011).
orthoCoding             Code a character string (written word form) as
                        letter n-grams
plot.RescorlaWagner     Plot function for the output of
                        'RescorlaWagner'.
plot.ndlClassify        Plot function for selected results of
                        'ndlClassify'.
plurals                 Artificial data set used to illustrate the
                        Rescorla-Wagner equations and naive
                        discriminative learning.
predict.ndlClassify     Predict method for ndlClassify objects
random.pseudoinverse    Calculate an approximation of the pseudoinverse
                        of a matrix.
serbian                 Serbian case inflected nouns.
serbianLex              Serbian lexicon with 1187 prime-target pairs.
serbianUniCyr           Serbian case inflected nouns (in Cyrillic
                        Unicode).
serbianUniLat           Serbian case inflected nouns (in Latin-alphabet
                        Unicode).
summary.ndlClassify     A summary of a Naive Discriminatory Learning
                        Model
summary.ndlCrossvalidate
                        A summary of a crossvalidation of a Naive
                        Discriminatory Reader Model
think                   Finnish 'think' verbs.

For more detailed information on the core Rescorla-Wagner equations, see the functions RescorlaWagner and plot.RescorlaWagner, as well as the data sets danks, numbers (data courtesy of Michael Ramscar), and lexample (an example discussed in Baayen et al. 2011).
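As a rough illustration of the update rule behind these functions, the following base R sketch applies the Rescorla-Wagner equations to a sequence of learning events for a single outcome. The helper rw_trace and its parameter values (alpha, beta1, beta2, lambda) are assumptions made for this sketch; it is not the package's RescorlaWagner function and does not reproduce its interface.

```r
# Hypothetical helper (not part of ndl): Rescorla-Wagner learning for one outcome.
# events: list of character vectors, the cues present at each learning event
# outcome_present: logical vector, whether the outcome occurs at each event
rw_trace <- function(events, outcome_present, alpha = 0.1, beta1 = 0.1,
                     beta2 = 0.1, lambda = 1) {
  cues <- sort(unique(unlist(events)))
  V <- setNames(numeric(length(cues)), cues)  # association strengths start at 0
  for (t in seq_along(events)) {
    present <- events[[t]]
    total <- sum(V[present])                  # summed strength of the cues present
    if (outcome_present[t]) {
      delta <- alpha * beta1 * (lambda - total)  # outcome present: move toward lambda
    } else {
      delta <- alpha * beta2 * (0 - total)       # outcome absent: move toward 0
    }
    V[present] <- V[present] + delta          # only cues that are present are updated
  }
  V
}

# Two cues that always occur together and always predict the outcome
events <- rep(list(c("h", "a")), 100)
V <- rw_trace(events, rep(TRUE, 100))
print(V)
```

Because both cues are always present together, they share the available association strength equally, and their summed strength approaches lambda as the number of events grows.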

The user-level functions for naive discriminative learning are estimateWeights and estimateActivations. The relevant data sets are serbian, serbianUniCyr, serbianUniLat, and serbianLex. The examples for serbianLex present the full simulation for Experiment 1 of Baayen et al. (2011).

Key functionality for the user is provided by the functions orthoCoding, estimateWeights, and estimateActivations. orthoCoding calculates the letter n-grams for character strings, to be used as cues. It is assumed that the meaning or meanings (separated by underscores if there is more than one) are available as outcomes. The frequency with which each (unique) combination of cues and outcomes occurs is also required. For some example input data sets, see: danks, plurals, serbian, serbianUniCyr and serbianUniLat.
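To make the idea of letter n-gram cues concrete, here is a small base R sketch. The helper letter_ngrams is hypothetical and is not the package's orthoCoding; the "#" word-boundary marker and the underscore separator between n-grams are assumptions of this sketch.

```r
# Hypothetical helper (not part of ndl): code a word as letter n-grams.
# For n > 1 the word is padded with a boundary marker on both sides.
letter_ngrams <- function(word, n = 2, boundary = "#") {
  if (n > 1) word <- paste0(boundary, word, boundary)
  chars <- strsplit(word, "")[[1]]
  if (length(chars) < n) return(paste(chars, collapse = ""))
  starts <- seq_len(length(chars) - n + 1)
  grams <- vapply(starts,
                  function(i) paste(chars[i:(i + n - 1)], collapse = ""),
                  character(1))
  paste(grams, collapse = "_")   # n-grams joined into one cue string
}

bigram_cues <- vapply(c("hand", "hands"), letter_ngrams, character(1), n = 2)
unigram_cues <- letter_ngrams("hand", n = 1)
print(bigram_cues)
print(unigram_cues)
```

Each resulting cue string would then be paired with its outcome(s) and a frequency to form one row of the input data.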

The function estimateWeights estimates the association strengths of cues to outcomes, using the equilibrium equations presented in Danks (2003). The function estimateActivations estimates the activations of outcomes (meanings) given cues (n-grams).
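The equilibrium idea can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the toy events, the names X, Y, freq, CC and CO, and the use of solve (which assumes the cue co-occurrence matrix is invertible) are all choices made for this sketch. At equilibrium, the weights W satisfy CC %*% W = CO, where CC holds frequency-weighted cue-cue co-occurrence counts and CO holds cue-outcome co-occurrence counts. The activation of an outcome for a given input is then the sum of the weights of the cues present.

```r
cues <- c("a", "b", "c")
outcomes <- c("X", "Y")

# One row per unique learning event: which cues and which outcomes are present
X <- matrix(c(1, 1, 0,
              1, 0, 1,
              0, 1, 0,
              0, 1, 1), ncol = 3, byrow = TRUE, dimnames = list(NULL, cues))
Y <- matrix(c(1, 0,
              0, 1,
              1, 0,
              0, 1), ncol = 2, byrow = TRUE, dimnames = list(NULL, outcomes))
freq <- c(10, 5, 3, 2)             # frequency of each event (toy values)

# Frequency-weighted co-occurrence counts
CC <- crossprod(X, freq * X)       # cue-by-cue
CO <- crossprod(X, freq * Y)       # cue-by-outcome

# Equilibrium weights: solve CC %*% W = CO (assumes CC is invertible)
W <- solve(CC, CO)

# Activations: for each event, sum the weights of the cues that are present
A <- X %*% W
print(round(W, 3))
print(round(A, 3))
```

When the co-occurrence matrix is singular, for example because two cues always occur together, a generalized inverse would be needed instead of solve.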

The Rcpp-based learn and learnLegacy functions use a C++ function to compute the conditional co-occurrence matrices required in the equilibrium equations. These are internally used by estimateWeights and should not be used directly by users of the package.

The key function for naive discriminative classification is ndlClassify; see data sets think and dative for examples.

Author(s)

Maintainer: Tino Sering <konstantin.sering@uni-tuebingen.de>

Author Contributions: Initial concept by R. Harald Baayen with contributions from Petar Milin and Peter Hendrix. First R coding done by R. Harald Baayen.

Initial R package development until version 0.1.6 by Antti Arppe. Initial documentation by Antti Arppe. Initial optimizations in C by Petar Milin and Antti Arppe.

Classification functionality developed further by Antti Arppe.

From version 0.2.14 to version 0.2.16, improvements to the NDL algorithm by Petar Milin and Cyrus Shaoul, and performance optimizations (C++ and Rcpp) by Cyrus Shaoul.

From version 0.2.17 onwards, bug fixes and CRAN compliance by Tino Sering.

References

Baayen, R. H., Milin, P., Filipovic Durdevic, D., Hendrix, P. and Marelli, M. (2011) An amorphous model for morphological processing in visual comprehension based on naive discriminative learning. Psychological Review, 118, 438-482.

Baayen, R. H. (2011) Corpus linguistics and naive discriminative learning. Brazilian Journal of Applied Linguistics, 11, 295-328.

Arppe, A. and Baayen, R. H. (in prep.) Statistical classification and principles of human learning.

Examples

## Not run: 
# Rescorla-Wagner
data(lexample)

lexample$Cues <- orthoCoding(lexample$Word, grams=1)
lexample.rw <- RescorlaWagner(lexample, nruns=25, traceCue="h",
   traceOutcome="hand")
plot(lexample.rw)
mtext("h - hand", 3, 1)

data(numbers)

traceCues <- c( "exactly1", "exactly2", "exactly3", "exactly4", "exactly5",
   "exactly6", "exactly7", "exactly10", "exactly15")
traceOutcomes <- c("1", "2", "3", "4", "5", "6", "7", "10", "15")

ylimit <- c(0,1)
par(mfrow=c(3,3), mar=c(4,4,1,1))

for (i in seq_along(traceCues)) {
  numbers.rw <- RescorlaWagner(numbers, nruns=1, traceCue=traceCues[i],
     traceOutcome=traceOutcomes[i])
  plot(numbers.rw, ylimit=ylimit)
  mtext(paste(traceCues[i], " - ", traceOutcomes[i], sep=""), side=3, line=-1,
    cex=0.7)
}
par(mfrow=c(1,1))

# naive discriminative learning (for complete example, see serbianLex)
# This function uses a Unicode dataset.
data(serbianUniCyr)
serbianUniCyr$Cues <- orthoCoding(serbianUniCyr$WordForm, grams=2)
serbianUniCyr$Outcomes <- serbianUniCyr$LemmaCase
sw <- estimateWeights(cuesOutcomes=serbianUniCyr, hasUnicode=TRUE)

desiredItems <- unique(serbianUniCyr["Cues"])
desiredItems$Outcomes <- ""
activations <- estimateActivations(desiredItems, sw)$activationMatrix
rownames(activations) <- unique(serbianUniCyr[["WordForm"]])

syntax <- c("acc", "dat", "gen", "ins", "loc", "nom", "Pl", "Sg")
activations2 <- activations[,!is.element(colnames(activations), syntax)]
head(rownames(activations2),50)
head(colnames(activations2),8)

image(activations2, xlab="word forms", ylab="meanings", xaxt="n", yaxt="n")
mtext(c("yena", "...", "zvuke"), side=1, line=1, at=c(0, 0.5, 1),  adj=c(0,0,1))
mtext(c("yena", "...", "zvuk"), side=2, line=1, at=c(0, 0.5, 1),   adj=c(0,0,1))

# naive discriminative classification
data(think)
think.ndl <- ndlClassify(Lexeme ~ Person + Number + Agent + Patient + Register,
   data=think)
summary(think.ndl)
plot(think.ndl, values="weights", type="hist", panes="multiple")
plot(think.ndl, values="probabilities", type="density")

## End(Not run)
