R: Basic statistics of texts in the Brown corpus

BrownStats {corpora}

R Documentation

Basic statistics of texts in the Brown corpus

Description

This data set provides some basic quantitative measures for all texts in the Brown corpus of written American English (Francis & Kucera 1964).

Usage


BrownStats

Format

A data frame with 500 rows and the following columns:

ty:: number of distinct types
to:: number of tokens (including punctuation)
se:: number of sentences
towl:: mean word length in characters, averaged over tokens
tywl:: mean word length in characters, averaged over types
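
The columns above can be combined into further per-text measures. The following is a minimal sketch, not part of the corpora package: the helper derive_measures and the toy data frame are hypothetical, and the toy values are made up for illustration rather than taken from the Brown corpus. It computes a type-token ratio (ty / to) and a mean sentence length in tokens (to / se).

```r
# Hypothetical helper (not provided by the corpora package): derive
# per-text measures from the documented columns ty, to and se.
derive_measures <- function(stats) {
  required <- c("ty", "to", "se")
  missing_cols <- setdiff(required, names(stats))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  data.frame(
    ttr      = stats$ty / stats$to,  # type-token ratio
    sent_len = stats$to / stats$se   # mean tokens per sentence (incl. punctuation)
  )
}

# Made-up toy values with the same column names as BrownStats;
# these are NOT actual Brown corpus figures.
toy <- data.frame(
  ty   = c(800, 650),
  to   = c(2400, 2600),
  se   = c(100, 130),
  towl = c(4.2, 4.0),
  tywl = c(6.1, 5.9)
)
derived <- derive_measures(toy)
print(derived)

# If the corpora package is installed, apply the same helper to the real data.
if (requireNamespace("corpora", quietly = TRUE)) {
  data("BrownStats", package = "corpora")
  print(summary(derive_measures(BrownStats)))
}
```

Because to counts punctuation as tokens, both derived measures include punctuation marks; that follows from the column definition rather than from a choice made in this sketch.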

Author(s)

Marco Baroni <baroni@sslmit.unibo.it>

References