modelStatistics {ndl}

R Documentation

Calculate a range of goodness of fit measures for an object fitted with some multivariate statistical method that yields probability estimates for outcomes.

Description

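modelStatistics calculates a range of goodness-of-fit measures from the observed outcomes, the predicted outcomes, and the estimated outcome probabilities of a fitted model: log-likelihoods, deviances, pseudo-R-squared statistics, AIC, BIC, the index of concordance C, and, optionally, a crosstabulation of observed against predicted outcomes.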

Usage

  modelStatistics(observed, predicted, frequency=NA, p.values,
     n.data, n.predictors, outcomes=levels(as.factor(observed)),
     p.normalize=TRUE, cross.tabulation=TRUE, 
     p.zero.correction=1/(NROW(p.values)*NCOL(p.values))^2)

Arguments

`observed`	observed values of the response variable
`predicted`	predicted values of the response variable; typically the outcome estimated to have the highest probability
`frequency`	frequencies of observed and predicted values; if `NA`, a frequency of 1 is assumed for every observed and predicted value
`p.values`	matrix of probabilities for all values of the response variable (i.e. outcomes)
`n.data`	total frequency of data points in the model
`n.predictors`	number of predictor levels in the model
`outcomes`	a vector with the possible values of the response variable
`p.normalize`	if `TRUE`, probabilities are normalized so that `sum(P)` of all outcomes for each datapoint is equal to 1
`cross.tabulation`	if `TRUE`, statistics on the crosstabulation of observed and predicted response values are calculated with `crosstableStatistics`
`p.zero.correction`	a small correction applied to response/outcome-specific probability estimates that are exactly P=0; necessary for the proper calculation of pseudo-R-squared statistics, since the logarithm of a zero probability is not finite; by default computed from the dimensions of the matrix of probabilities `p.values`.
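The effect of `p.normalize` and `p.zero.correction` can be illustrated with a minimal sketch in base R. The probability matrix below is invented for illustration, and the order of the two steps (zero correction first, then normalization) is an assumption; the package's internal code may proceed differently.

```r
# Toy probability matrix: 3 data points (rows) x 2 outcomes (columns).
# The values are invented for illustration only.
p.values <- matrix(c(0.6, 0.2,
                     0.0, 0.5,
                     0.3, 0.3),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(NULL, c("A", "B")))

# Default correction as given under Usage; it depends only on the
# dimensions of the matrix (here 1 / (3 * 2)^2 = 1/36).
p.zero.correction <- 1 / (NROW(p.values) * NCOL(p.values))^2

# Assumption: probabilities that are exactly 0 are replaced by the correction.
p.corrected <- p.values
p.corrected[p.corrected == 0] <- p.zero.correction

# Rescale each row so that the probabilities of all outcomes sum to 1,
# which is the effect described for p.normalize=TRUE.
p.normalized <- p.corrected / rowSums(p.corrected)

rowSums(p.normalized)
```

After the correction every entry is strictly positive, so `log(p.normalized)` is finite for every data point and outcome.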

Value

A list with the following components:

loglikelihood.null: Loglikelihood for null model
loglikelihood.model: Loglikelihood for fitted model
deviance.null: Null deviance
deviance.model: Model deviance
R2.likelihood: (McFadden's) R-squared
R2.nagelkerke: Nagelkerke's R-squared
AIC.model: Akaike's Information Criterion
BIC.model: Bayesian Information Criterion
C: index of concordance C (for binary response variables only)
crosstable: Crosstabulation of observed and predicted outcomes, if cross.tabulation=TRUE
crosstableStatistics(crosstable): Various statistics calculated on crosstable with crosstableStatistics, if cross.tabulation=TRUE
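To make the components above concrete, the following sketch computes them in base R from a small invented data set, using the common textbook definitions of each measure. It is an illustration under stated assumptions, not the package's own code: the data, the value of `n.predictors`, and the tie-handling rules are hypothetical, and `modelStatistics` may differ in detail (for example in how frequencies and zero probabilities are handled).

```r
# Invented toy data: 6 observations of a binary response with outcomes A and B.
observed <- factor(c("A", "A", "B", "A", "B", "B"))
p.values <- matrix(c(0.8, 0.2,
                     0.6, 0.4,
                     0.3, 0.7,
                     0.7, 0.3,
                     0.4, 0.6,
                     0.5, 0.5),
                   ncol = 2, byrow = TRUE,
                   dimnames = list(NULL, levels(observed)))
n.data <- length(observed)
n.predictors <- 2   # hypothetical value, chosen only for this sketch

# Log-likelihood of the model: log of the probability assigned to the
# outcome that was actually observed, summed over data points.
p.observed <- p.values[cbind(seq_len(n.data), as.integer(observed))]
loglikelihood.model <- sum(log(p.observed))

# Log-likelihood of the null model: every data point receives the
# overall (marginal) proportions of the outcomes.
counts <- table(observed)
loglikelihood.null <- sum(counts * log(counts / n.data))

deviance.model <- -2 * loglikelihood.model
deviance.null  <- -2 * loglikelihood.null

# Pseudo-R-squared statistics (McFadden; Nagelkerke).
R2.likelihood <- 1 - loglikelihood.model / loglikelihood.null
R2.nagelkerke <- (1 - exp((deviance.model - deviance.null) / n.data)) /
  (1 - exp(-deviance.null / n.data))

# Information criteria.
AIC.model <- 2 * n.predictors - 2 * loglikelihood.model
BIC.model <- n.predictors * log(n.data) - 2 * loglikelihood.model

# Index of concordance C (binary response): share of (A, B) pairs in which
# the A case has the higher estimated P(A); ties count as one half.
pA <- p.values[, "A"]
differences <- outer(pA[observed == "A"], pA[observed == "B"], "-")
C <- mean((differences > 0) + 0.5 * (differences == 0))

# Predicted outcome = most probable outcome (first one wins a tie),
# crosstabulated against the observed outcome.
predicted <- factor(levels(observed)[max.col(p.values, ties.method = "first")],
                    levels = levels(observed))
crosstable <- table(observed, predicted)
```

Both pseudo-R-squared statistics compare the fitted model with the null model, so they lie between 0 and 1 whenever the model log-likelihood is higher than the null log-likelihood; AIC and BIC differ only in how strongly they penalize `n.predictors`.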

Author(s)

Antti Arppe and Harald Baayen

References

Arppe, A. 2008. Univariate, bivariate and multivariate methods in corpus-based lexicography – a study of synonymy. Publications of the Department of General Linguistics, University of Helsinki, No. 44. URN: http://urn.fi/URN:ISBN:978-952-10-5175-3.

Arppe, A., and Baayen, R. H. (in prep.) Statistical modeling and the principles of human learning.

Hosmer, David W., Jr., and Stanley Lemeshow 2000. Applied Logistic Regression (2nd edition). New York: Wiley.

Examples

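library(ndl)
data(think)
# Fit a naive discriminative learning model of lexeme choice
think.ndl <- ndlClassify(Lexeme ~ Agent + Patient, data=think)
# Convert the activations once into outcome probabilities and predicted outcomes
think.probs <- acts2probs(think.ndl$activationMatrix)
probs <- think.probs$p
preds <- think.probs$predicted
n.data <- nrow(think)
# Number of predictor levels, taken here as the number of cells in the weight matrix
n.predictors <- nrow(think.ndl$weightMatrix) *
   ncol(think.ndl$weightMatrix)
modelStatistics(observed=think$Lexeme, predicted=preds, p.values=probs,
   n.data=n.data, n.predictors=n.predictors)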

[Package ndl version 0.2.18 Index]