BrownStats {corpora}    R Documentation

Basic statistics of texts in the Brown corpus

Description

This data set provides some basic quantitative measures for all texts in the Brown corpus of written American English (Francis & Kucera 1964).

Usage


BrownStats

Format

A data frame with 500 rows and the following columns:

ty:

number of distinct types

to:

number of tokens (including punctuation)

se:

number of sentences

towl:

mean word length in characters, averaged over tokens

tywl:

mean word length in characters, averaged over types

Author(s)

Marco Baroni <baroni@sslmit.unibo.it>

References

Francis, W. N. and Kucera, H. (1964). Manual of information to accompany a standard sample of present-day edited American English, for use with digital computers. Technical report, Department of Linguistics, Brown University, Providence, RI.

See Also

LOBStats
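
Examples

The columns described under Format can be combined into simple derived measures. The following is a minimal sketch, not part of the package: the helper add_derived_measures and the column names ttr and msl are hypothetical, introduced here only for illustration. It assumes that the data set can be loaded with data() from an installed copy of corpora, and it skips that step otherwise.

```r
# Hypothetical helper (not provided by the corpora package): adds two
# derived per-text measures to a data frame that has the columns
# ty (types), to (tokens) and se (sentences).
add_derived_measures <- function(stats) {
  stopifnot(all(c("ty", "to", "se") %in% names(stats)))
  stats$ttr <- stats$ty / stats$to  # type-token ratio
  stats$msl <- stats$to / stats$se  # mean sentence length in tokens
                                    # (token counts include punctuation)
  stats
}

# Apply the helper to the real data only if corpora is installed.
if (requireNamespace("corpora", quietly = TRUE)) {
  data(BrownStats, package = "corpora")
  brown <- add_derived_measures(BrownStats)
  print(summary(brown[, c("ttr", "msl")]))
}
```

Because the token counts include punctuation, the mean sentence length computed this way is somewhat higher than a count based on words alone.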


[Package corpora version 0.5-1 Index]