BNCdomains {corpora} R Documentation

## Distribution of domains in the British National Corpus (BNC)

### Description

This data set gives the number of documents and tokens in each of the 18 domains represented in the British National Corpus, World Edition (BNC). See Aston & Burnard (1998) for more information about the BNC and the domain classification, or go to http://www.natcorp.ox.ac.uk/.

### Usage


BNCdomains



### Format

A data frame with 19 rows (one for each of the 18 domains, plus one row for the document without a domain classification) and the following columns:

domain: name of the respective domain in the BNC

documents: number of documents from this domain

tokens: total number of tokens in all documents from this domain
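
As a minimal sketch of how these columns might be used, the following inspects the data frame and derives the average document length per domain. It assumes the corpora package is installed and attached; the variable name `avg_len` is chosen here for illustration and is not part of the data set.

```r
library(corpora)  # assumption: the corpora package is installed

# Inspect the structure; the Format section describes 19 rows and
# the columns domain, documents and tokens
str(BNCdomains)

# Average document length (tokens per document) for each domain.
# `avg_len` is an illustrative name, not part of the data set.
avg_len <- BNCdomains$tokens / BNCdomains$documents
names(avg_len) <- as.character(BNCdomains$domain)

# Domains ordered from longest to shortest average document
round(sort(avg_len, decreasing = TRUE))
```

Dividing the two count columns directly works because both are stored per domain, so no aggregation step is needed.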

### Details

For one document in the BNC, the domain classification is missing. This document is represented by the code Unlabeled in the data set.
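
Analyses that compare domains may want to set this row aside. The sketch below shows one possible way to do so, assuming the code is spelled exactly `Unlabeled` as stated above; the names `is_unlabeled`, `labeled` and `share` are introduced here for illustration only.

```r
library(corpora)  # assumption: the corpora package is installed

# Flag the row for the document without a domain classification.
# as.character() makes the comparison work whether domain is a
# factor or a character vector.
is_unlabeled <- as.character(BNCdomains$domain) == "Unlabeled"

# Show the row for the unclassified document
BNCdomains[is_unlabeled, ]

# Keep only the rows with a proper domain label
labeled <- BNCdomains[!is_unlabeled, ]

# Proportion of all labeled tokens contributed by each domain
# (`share` is an illustrative column name)
labeled$share <- labeled$tokens / sum(labeled$tokens)
labeled[order(labeled$share, decreasing = TRUE), ]
```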

### Author(s)

Marco Baroni <baroni@sslmit.unibo.it>

### References

Aston, Guy and Burnard, Lou (1998). The BNC Handbook. Edinburgh University Press, Edinburgh. See also the BNC homepage at http://www.natcorp.ox.ac.uk/.

[Package corpora version 0.5-1 Index]