modelStatistics {ndl}    R Documentation

Calculate a range of goodness of fit measures for an object fitted with some multivariate statistical method that yields probability estimates for outcomes.

Description

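modelStatistics calculates a range of goodness-of-fit measures (log-likelihoods, deviances, pseudo-R-squared statistics, AIC, BIC and, optionally, statistics on the crosstabulation of observed and predicted outcomes) from the observed outcomes, the predicted outcomes, and a matrix of outcome probability estimates.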

Usage

  modelStatistics(observed, predicted, frequency=NA, p.values,
     n.data, n.predictors, outcomes=levels(as.factor(observed)),
     p.normalize=TRUE, cross.tabulation=TRUE, 
     p.zero.correction=1/(NROW(p.values)*NCOL(p.values))^2)

Arguments

observed

observed values of the response variable

predicted

predicted values of the response variable; typically the outcome estimated to have the highest probability

frequency

frequencies of observed and predicted values; if NA (the default), a frequency of 1 is assumed for every observed and predicted value

p.values

matrix of probabilities for all values of the response variable (i.e., outcomes)

n.data

total frequency (sum of frequencies) of the data points in the model

n.predictors

number of predictor levels in model

outcomes

a vector with the possible values of the response variable

p.normalize

if TRUE, probabilities are normalized so that the probabilities of all outcomes sum to 1 for each data point

cross.tabulation

if TRUE, statistics on the crosstabulation of observed and predicted response values are calculated with crosstableStatistics

p.zero.correction

a small adjustment for response/outcome-specific probability estimates that are exactly P=0; necessary for the proper calculation of the pseudo-R-squared statistics; by default computed from the dimensions of the matrix of probabilities p.values, as 1/(NROW(p.values)*NCOL(p.values))^2.
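
As a rough illustration of p.normalize and p.zero.correction, the sketch below applies a zero correction and a row normalization to a small, made-up probability matrix in base R. The matrix p and the choice to substitute the correction value for exact zeros are assumptions made for illustration. Only the default formula for p.zero.correction is taken from the Usage section, and the internals of modelStatistics may differ.

```r
# Hypothetical probability matrix: 3 data points (rows) x 2 outcomes (columns).
p <- matrix(c(0.0, 1.0,
              0.3, 0.6,
              0.5, 0.5),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("A", "B")))

# Default correction from the Usage section: 1/(NROW * NCOL)^2.
p.zero.correction <- 1 / (NROW(p) * NCOL(p))^2   # 1/36 for a 3 x 2 matrix

# Assumption: probabilities that are exactly zero are replaced by the correction.
p.adj <- p
p.adj[p.adj == 0] <- p.zero.correction

# Normalize so that the probabilities of each data point (row) sum to 1.
p.norm <- p.adj / rowSums(p.adj)

rowSums(p.norm)
```

After the correction no entry is exactly zero, so the logarithm of every probability is finite, which is what the log-likelihood-based statistics require.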

Value

A list with the following components:

loglikelihood.null

Log-likelihood of the null model

loglikelihood.model

Log-likelihood of the fitted model

deviance.null

Null deviance

deviance.model

Model deviance

R2.likelihood

(McFadden's) R-squared

R2.nagelkerke

Nagelkerke's R-squared

AIC.model

Akaike's Information Criterion

BIC.model

Bayesian Information Criterion

C

Index of concordance C (for binary response variables only)

crosstable

Crosstabulation of observed and predicted outcomes, if cross.tabulation=TRUE

crosstableStatistics(crosstable)

Various statistics calculated on crosstable with crosstableStatistics, if cross.tabulation=TRUE
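
The likelihood-based components listed above can be sketched in base R using their usual textbook definitions. This is a minimal illustration on made-up data, not the code of modelStatistics: the toy observed and p.values objects are hypothetical, the null model is assumed to predict each outcome at its observed relative frequency, and the number of parameters in AIC and BIC is assumed to equal n.predictors.

```r
# Hypothetical data: observed outcomes and estimated outcome probabilities.
observed <- factor(c("A", "A", "B", "B", "A"))
p.values <- matrix(c(0.8, 0.2,
                     0.6, 0.4,
                     0.3, 0.7,
                     0.1, 0.9,
                     0.7, 0.3),
                   ncol = 2, byrow = TRUE,
                   dimnames = list(NULL, levels(observed)))
n.data <- length(observed)
n.predictors <- 2   # assumed number of model parameters

# Probability that the model assigns to the outcome actually observed.
p.obs <- p.values[cbind(seq_len(n.data), as.integer(observed))]
loglikelihood.model <- sum(log(p.obs))

# Null model (assumption): each outcome predicted at its relative frequency.
counts <- as.numeric(table(observed))
loglikelihood.null <- sum(counts * log(counts / n.data))

deviance.model <- -2 * loglikelihood.model
deviance.null  <- -2 * loglikelihood.null

# McFadden's (likelihood-ratio) R-squared.
R2.likelihood <- 1 - loglikelihood.model / loglikelihood.null

# Nagelkerke's R-squared: Cox-Snell R-squared rescaled to a maximum of 1.
R2.coxsnell   <- 1 - exp((deviance.model - deviance.null) / n.data)
R2.nagelkerke <- R2.coxsnell / (1 - exp(-deviance.null / n.data))

# Information criteria with k = n.predictors parameters.
AIC.model <- deviance.model + 2 * n.predictors
BIC.model <- deviance.model + n.predictors * log(n.data)
```

A probability estimate of exactly zero for an observed outcome would make log(p.obs) equal to -Inf, which is why a zero correction is needed before statistics of this kind are computed.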

Author(s)

Antti Arppe and Harald Baayen

References

Arppe, A. 2008. Univariate, bivariate and multivariate methods in corpus-based lexicography – a study of synonymy. Publications of the Department of General Linguistics, University of Helsinki, No. 44. URN: http://urn.fi/URN:ISBN:978-952-10-5175-3.

Arppe, A., and Baayen, R. H. (in prep.) Statistical modeling and the principles of human learning.

Hosmer, D. W., Jr., and Lemeshow, S. 2000. Applied Logistic Regression (2nd edition). New York: Wiley.

See Also

ndlClassify, ndlStatistics, crosstableStatistics

Examples

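library(ndl)

# Fit a naive discriminative learning model to the 'think' data
data(think)
think.ndl <- ndlClassify(Lexeme ~ Agent + Patient, data=think)

# Convert activations to probability estimates and predicted outcomes
think.probs <- acts2probs(think.ndl$activationMatrix)
probs <- think.probs$p
preds <- think.probs$predicted

# Number of data points and number of entries in the weight matrix
n.data <- nrow(think)
n.predictors <- nrow(think.ndl$weightMatrix) *
   ncol(think.ndl$weightMatrix)

modelStatistics(observed=think$Lexeme, predicted=preds, p.values=probs,
   n.data=n.data, n.predictors=n.predictors)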

[Package ndl version 0.2.18 Index]