linguistic.relatedness {cluster.datasets}    R Documentation

Hartigan (1975) Relatedness Values of Selected Words

Description

Frequencies with which a pair of words is judged more closely related than other pairs, over many triads and subjects. The data are Table 10.4 in Chapter 10 of Hartigan (1975), page 184.

Usage

data(linguistic.relatedness)

Format

A data frame with 6 observations on the following 7 variables.

word

a character vector giving the word that labels each row

the

a numeric vector for the frequency with which words are related to 'the'

boy

a numeric vector for the frequency with which words are related to 'boy'

has

a numeric vector for the frequency with which words are related to 'has'

lost

a numeric vector for the frequency with which words are related to 'lost'

a

a numeric vector for the frequency with which words are related to 'a'

dollar

a numeric vector for the frequency with which words are related to 'dollar'

Details

This is an unusual data set to be used with the triads-leader algorithm.

Source

Levelt, W. J. M. (1967). Psychological representations of syntactic structures, in The Structure and Psychology of Language, T. G. Bever and W. Weksel, eds, Holt, Rinehart and Winston, New York.

SPAETH2 Cluster Analysis Datasets http://people.sc.fsu.edu/~jburkardt/datasets/spaeth2/spaeth2.html

References

Hartigan, J. A. (1975). Clustering Algorithms, John Wiley, New York.

Examples

data(linguistic.relatedness)
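
The table can also be explored with a standard hierarchical clustering, as a rough sketch only. This is not the triads-leader algorithm mentioned in Details. The helpers relatedness_matrix() and relatedness_to_dist() are hypothetical and not part of cluster.datasets. The sketch assumes that the rows follow the same word order as the columns, that any missing half of the table can be filled from its transpose, and that a larger relatedness value should map to a smaller distance.

```r
# Hypothetical helper (not part of cluster.datasets): build a square matrix
# from the data frame, using the 'word' column as row names.
relatedness_matrix <- function(df, label_col = "word") {
  m <- as.matrix(df[, setdiff(names(df), label_col), drop = FALSE])
  storage.mode(m) <- "double"
  rownames(m) <- as.character(df[[label_col]])
  # Assumption: if only one triangle is filled, mirror it into the other.
  missing <- is.na(m)
  m[missing] <- t(m)[missing]
  m
}

# Hypothetical helper: convert relatedness to a dissimilarity.
# Assumption: higher relatedness means smaller distance, so each value is
# subtracted from the largest observed value.
relatedness_to_dist <- function(m) {
  d <- max(m, na.rm = TRUE) - m
  diag(d) <- 0
  as.dist(d)
}

# Run on the packaged data only when cluster.datasets is installed.
if (requireNamespace("cluster.datasets", quietly = TRUE)) {
  data(linguistic.relatedness, package = "cluster.datasets")
  m <- relatedness_matrix(linguistic.relatedness)
  hc <- hclust(relatedness_to_dist(m), method = "average")
  plot(hc, main = "Word relatedness (average linkage)")
}
```

Average linkage is an arbitrary choice here; other hclust methods may group the words differently.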

[Package cluster.datasets version 1.0-1 Index]